RESEARCH IN CLINICAL PSYCHOLOGY: 1953
WILLIAM SCHOFIELD 


University of Minnesota 


This is the fifth annual review of research papers appearing in six journals 
which appear to carry the bulk of such literature in the clinical field. These journals 
are: Journal of Clinical Psychology, Journal of Consulting Psychology, Journal of 
Abnormal and Social Psychology, Journal of Applied Psychology, Journal of General 
Psychology, and Journal of Psychology. As in previous years, the same definition of 
research was applied in culling these journals; papers were included in the survey if 
they presented “systematic investigation of a specifically described group of subjects
and the derivation of normative or comparative data from psychometrics,
case histories, or therapeutic interviews; or analysis of administration, scoring, and
interpretation of a given instrument.”

The distribution of the 186 papers found in the six journals for 1953 is reported 
in Table 1 according to the major research areas represented. Comparable data for 
1952 are also reported. The apparent stability of the total volume of research for 
the past two years requires comment. The 181 studies of 1952 constituted a better 
than 25% increase over the total for the preceding year. Approximately a third
of this increase was accounted for by a special supplement to the Journal of
Abnormal and Social Psychology. Inasmuch as there was no such supplement for 1953,
the total of 186 studies for the past year suggests a continuing increase in the
research activities of clinicians.

TABLE 1. DISTRIBUTION OF 186 RESEARCH STUDIES REPORTED IN SIX SELECTED JOURNALS IN 1953,
BY AREAS OF RESEARCH REPRESENTED, WITH COMPARATIVE DATA FOR 1952

Area    No. of Studies    % of Total    % of 1952 Total    Rank 1952

Validity (projective techniques)    38
Normative study (personality)
Normative (projective techniques)
Normative study (intelligence)
Validity (structured personality tests)
Objective evaluation of therapy
Intertest relationships
Validity (prognostic indicators)
Test standardization
Normative (structured personality tests)
New test (projective)
Validity of psychiatric diagnosis
Experimental studies of anxiety
Analysis of recorded interviews
Physiological studies
Validity, W-B diagnostic patterns
Abbreviated intelligence tests
New tests (intelligence)
Detection of malingering
Miscellaneous

It is necessary to recognize, however, that changes in editorial policy of certain 
of the journals have resulted in the acceptance for publication of a larger number of 
shorter papers. While emphasis on brevity in research reports is a reasonable ap- 
proach to reduction of publication lag and increase in number of studies published, 
such emphasis can entail distinct loss in scholarly communication. The deletion of 
information pertinent to certain niceties of design or experimental procedure may 
not only interfere with thorough appraisal of a research but may more generally 
discourage a critical approach to the reading of research reports. The proper
balancing of considerations for adequate scientific communication and reasonably prompt
publication of all deserving papers presents a knotty problem which appears not as 
yet to have received a satisfactory solution. 

The relative stability of the pattern of research activity in clinical psychology 
over the past five years is indicated by Table 2, which presents the rank-order cor- 
relation coefficients for the frequencies of major types of research study in con- 
secutive years. These coefficients suggest that the relative emphases in research 
first noticed in 1951 have remained essentially the same in the last two years. 


TABLE 2. RANK-ORDER CORRELATIONS BETWEEN CONSECUTIVE YEARS FOR THE
FREQUENCIES OF RESEARCHES IN CERTAIN AREAS OF CLINICAL PSYCHOLOGY

Years    1949-50    1950-51    1951-52    1952-53
rho      +.681      +.543      +.838      +.821
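
The coefficients of Table 2 are rank-order correlations between the category frequencies of consecutive years. The following is a minimal sketch of such a computation, assuming the usual Spearman procedure (a product-moment correlation applied to ranks, with tied values given their average rank). The two frequency lists are hypothetical and are not the survey's counts; the function names are introduced here for illustration only.

```python
import math


def average_ranks(values):
    """Rank values from largest (rank 1) to smallest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(freq_a, freq_b):
    """Rank-order correlation between two lists of category frequencies."""
    return pearson(average_ranks(freq_a), average_ranks(freq_b))


# Hypothetical numbers of studies in five research areas for two years.
year_a = [40, 30, 25, 10, 5]
year_b = [36, 20, 28, 12, 4]
rho = spearman_rho(year_a, year_b)
```

With these hypothetical counts only the second and third areas exchange ranks, so the coefficient remains high; a wholesale reshuffling of the areas would drive it toward zero.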








As revealed by Table 1, the most frequent type of research in 1953 is that con- 
cerned with validation of projective devices. Twenty per cent of all the studies in 
the survey fall in this category which has accounted for an average of 19.2% of all 
the studies in the years 1949-1952, inclusive. This is the fifth consecutive year that 
this category has topped the list of researches. 

Of the 38 studies in Category #1 of Table 1, validation of projective techniques, 
the largest number, fifteen, is concerned with the Rorschach. Nine of these fifteen 
studies are essentially validity studies, concerned with experimental verification of 
basic Rorschach concepts. Allen, Stiff, and Rosenzweig add a further study to the
growing literature which suggests invalidity of the fundamental role assigned to the 
assumed influence of color as a determinant of Rorschach protocols. Having pre- 
viously failed to establish evidence for any consistency in the influence of color on 
the protocols of normals, in this study the authors replicate a balanced design
involving use of a standard and achromatic Rorschach series with ten (!) V. A.
psychotics and eight (!) V. A. neurotics. The test-retest protocols, obtained over an
interval of nine months, were analyzed with respect to a check-list of ten of the 
more common “color shock” indices. Appropriate tests revealed no reliable dif- 
ferences between the chromatic and achromatic series in either of the samples, and 
the authors conclude that presence or absence of color shows the same failure to in- 
fluence Rorschach protocols of neurotics and psychotics as has been previously 
found for normals. These conclusions must necessarily be held cautiously in view 
of the very small Ns of this study, but the findings are in agreement with those 
of comparable recent researches. 

Gibby, Miller, and Walker report a research concerned with the examiner’s
influence on Rorschach protocols. Using records obtained from a sample of V. A. 
mental hygiene clinic patients by twelve examiners, analysis was made of the 
reliability of inter-examiner differences for the frequencies of major Rorschach 
variables. Parallel analyses were made for absolute score values and for scores 
expressed as percentages of the total records. In both cases, reliable differences were 
found for certain scoring variables, the F and C scores being prime examples. A 
tabulation of the cases assigned to the various examiners revealed no variations in 
diagnostic representations which would account for the score differences. 

Maradie used a Latin square design with 22 volunteer student nurses to
study the influence of order of presentation of the Rorschach cards on productivity. 
Ten distinct sequences were represented, with each of the cards being preceded by 
and followed by every other card just once. Subjects were randomly assigned to 
each of the card orders. Replication was achieved by random assignment of an addi- 
tional S to each of the ten orders. Appropriate tests for homogeneity of variance 
were made and analysis of variance was applied to test the effect of order of
presentation, of specific cards, etc. Later appearing cards elicited reliably more
responses than earlier appearing cards, and certain cards (e.g., #10), regardless of
their order of appearance or sequential relationships, received larger numbers of
responses than others. The author interprets his findings to suggest that increased 
productivity on the later cards of the standard sequence may be independent of the 
color variable and simply a positional effect. Specific test of this hypothesis by use 
of an achromatic Rorschach series in the same design is planned by Maradie. 
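
The property claimed for Maradie's ten sequences, that each card is immediately preceded and followed by every other card just once, can be illustrated with a standard construction for a balanced Latin square of even order. The sketch below is an assumption about how such a set of orders might be built; the paper as summarized here does not say which construction Maradie used, and the function names are hypothetical.

```python
def balanced_sequences(n_cards: int) -> list[list[int]]:
    """Build n_cards presentation orders in which every ordered pair of
    distinct cards appears in immediate succession exactly once.

    The first order zig-zags 1, 2, n, 3, n-1, ...; every later order adds
    a constant shift (mod n). This works only for an even number of cards.
    """
    if n_cards < 2 or n_cards % 2 != 0:
        raise ValueError("this construction needs an even number of cards")
    first = [0]
    low, high = 1, n_cards - 1
    while len(first) < n_cards:
        first.append(low)
        low += 1
        if len(first) < n_cards:
            first.append(high)
            high -= 1
    # Shift the first order cyclically and renumber the cards 1..n_cards.
    return [[(card + shift) % n_cards + 1 for card in first]
            for shift in range(n_cards)]


def adjacent_pairs(sequences):
    """All (earlier card, next card) pairs across the given orders."""
    return [(a, b) for seq in sequences for a, b in zip(seq, seq[1:])]


orders = balanced_sequences(10)   # ten orders of the ten Rorschach cards
pairs = adjacent_pairs(orders)    # 90 ordered pairs, none repeated
```

Because each card also occupies each serial position exactly once across the ten orders, card effects and position effects can be separated in the analysis of variance.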

In a study concerned with the possible relationship of verbal fluency and
intelligence to Rorschach performance, Lotsof factored the intercorrelation matrix
for the major Rorschach scoring variables obtained from the protocols of 30 college
students. Verbal fluency measures were obtained from the recorded verbal
descriptions given by each S for five hypothetical situations; these measures included total
word counts and numbers of adjectives and verbs. These measures were included in 
the intercorrelation matrix. The factor analysis yielded four factors identified as: 
verbal intelligence, productivity, elaboration, and individuality. Implications for 
Rorschach theory of certain of the intercorrelations found are discussed relative to 
the —— of independence, e.g., M and sum C yielded a product-moment 
r of +.54.

Williams and Lawrence in another factorial study sought to check
Wittenborn’s experimental studies of an “intellectual factor” among several of the standard
Rorschach scores. A Thurstone centroid factorization with orthogonal rotation was 
applied to the intercorrelation matrix provided by the Rorschachs and W-B IQs 
of 86 psychiatric patients randomly selected from a large Army hospital. Five factors 
were extracted which were considered to be confirmatory of Wittenborn’s earlier 
work; these were identified as: productivity, movement, lack of perceptual control, 
shading, and intelligence. The “intelligence factor” is considered to support clinical
impressions concerning Rorschach measures of intellect in that high loadings appear 
for the two Wechsler scale IQs and for the W, M, F, FC, and R variables. The 
authors do not comment on the equally high loading of Dd on the intelligence 
factor, nor do they remark on the essentially equal appearance of Dd, M, and FC 
on both the intelligence and “movement” factors. 

A study of the validity of Wittenborn’s “lack of perceptual control” (LPC) 
score was reported by Fabrikant. Using two samples of 32 psychoneurotic
veterans each, the Rorschach was administered and re-administered, with a two-week
interval intervening. One group had the same instructions for both
administrations; the other sample, prior to retest, was given directions intended to maximize
changes in responses. The LPC score was determined for all records and the means
for test and retests of each group computed. No report is given of the variances nor 
is there any mention that they were tested for homogeneity. A t-test for reliability 
of the differences between the means was computed. No mention is made of the 
correlation between the test and retest LPC scores, and it appears that erroneous 
use was made of the t-test for uncorrelated means. Since it is probable that the
test-retest correlation for the LPC score would be of some magnitude, the standard error
of the mean differences, properly computed, would be considerably smaller than 
those obtained by Fabrikant. While the absolute values of the mean differences 
appear so small as to make it unlikely that even the smaller, correct standard errors 
would yield significant t-values, the necessary data to make this check are not re- 
ported. At any rate, Fabrikant’s conclusion that the Wittenborn-Mettler hypotheses 
concerning LPC did not apply to his samples is not supported. In this context, it is 
appropriate to call attention to the critical note of Stanley and Diller in which
they review general and specific consequences of failure to include the covariance 
term in applying Student’s test to paired scores. 
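
The effect of the omitted covariance term can be shown with a small worked sketch. It assumes the usual formulas for the standard error of a difference between two means based on the same n subjects, with and without the correlation r between test and retest scores. The standard deviations and the value of r below are hypothetical, since the necessary data are not reported in Fabrikant's paper; only n = 32 is taken from the study.

```python
import math


def se_diff_uncorrelated(sd_test, sd_retest, n):
    """Standard error of a mean difference, treating the two sets of
    scores as independent samples of size n each."""
    return math.sqrt(sd_test ** 2 / n + sd_retest ** 2 / n)


def se_diff_correlated(sd_test, sd_retest, n, r):
    """Standard error of a mean difference for paired scores: the same
    expression with the covariance term 2 * r * sd_test * sd_retest
    subtracted."""
    variance_of_diff = sd_test ** 2 + sd_retest ** 2 - 2 * r * sd_test * sd_retest
    return math.sqrt(variance_of_diff / n)


# Hypothetical values: n = 32 per group as in the study; SDs and r assumed.
n = 32
sd_test, sd_retest, r = 4.0, 4.0, 0.6

se_wrong = se_diff_uncorrelated(sd_test, sd_retest, n)   # ignores the pairing
se_right = se_diff_correlated(sd_test, sd_retest, n, r)  # includes covariance
```

For any positive test-retest correlation the paired standard error is the smaller of the two, so the same mean difference yields a larger t when the covariance term is included.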

Barrell developed a detailed system for classification of M responses on the
Rorschach and analyzed the relationships between the various M responses (in the 
records of 121 male, beginning graduate students in clinical psychology) and ratings 
of certain intellectual and emotional variables made by the Michigan assessment 
project staff. For 23 of the 33 subcategories of M response, the interscorer reliabilities 
satisfied the 1% level of confidence. Using appropriate tests of the degree of
association or correlation between the various Rorschach movement categories and ratings
on four intellectual variables and eleven emotional variables, Barrell concluded that
there were differential and reliable relationships between the Rorschach variables 
and intellectual function but that the Rorschach sub-categories and the various 
ratings of emotional behavior were unrelated. 

A paper which brings interesting reflections on the current status of clinical 
diagnostic testing and its rationale is one by Guertin and Trembath in which they
explore the potential of Card VI as a stimulus to responses which might distinguish 
sex offenders from a control group matched for chronological age. The offenders were 
63 hospitalized deviates with histories of indecent exposure, indecent liberties with 
minors, oral perversions, and homosexuality. The controls were state hospital 
employees who were examined routinely as a part of personnel selection. In an initial 
analysis, one of the authors attempted to identify the control or offender member- 
ships of pairs of Card VI protocols, each pair composed of a protocol from the de- 
viate and control samples respectively, matched for age and with randomization 
of the temporal order of presentation of the two records. The Card VI responses 
were read to this author together with quantitative summaries of the entire record. 
His identification was 2.4% better than chance! A second analysis consisted of
quantitative tabulation of the occurrence of nineteen purported clinical indicators 
of disturbance in the Card VI responses of the two groups and use of chi-square to 
test for independence. None of these indicators yielded reliable differentiation of 
the two groups. The authors are not content to simply state their negative findings 
but rather feel compelled to rationalize them by questioning the validity of their 
criterion groups. They go on to suggest, “The Rorschach is a very sensitive
instrument and it reflects psychosexual immaturity too well. The sex offender cannot be
detected by the presence of psychosexual disturbance because this is so common an 
occurrence.” Thus, a posteriori they explain their failure to find distinctions on the
basis of an assumption, by implication a priori, that the distinctions do not exist,
i.e., “‘Normals’ all demonstrate various aspects and degree (sic) of psychosexual
maldevelopment.” If this be so, how explain that some persons offend and others do
not? According to Guertin and Trembath, the differences are to be found in “the
subtle controls of human action.” They do not suggest that the Rorschach, in
addition to being too sensitive to certain variables, may fail to reveal or predict these
subtle controlling factors. This would seem to be an example of the all too frequent, 
strained logic for the negative instance which is offered by clinicians motivated to the 
defense of a deification. 

Wertheimer reports a validity study of the “eye” content response to the
Rorschach as an indicator of a paranoid component. Using Ns of 25-30, with
essentially equal sex representation, for each of eight diagnostic groups (including
neurotics, paranoids, schizophrenics, and organics), count was made of each occurrence
of the word eyes in the complete Rorschach protocols of 119 state hospital patients. 
These counts were converted to proportions of the total R for each card and for the 
total record. “Paranoid patients did not produce significantly more eye content
responses than did other diagnostic groups, nor did patients whose symptoms in- 
clude suspiciousness produce more eye content response than did patients whose 
symptoms did not include suspiciousness.” 

In addition to the above nine validation studies of basic hypotheses, the fifteen 
Rorschach studies of Category #1 include three studies of the validity of various
“signs” and patterns. Berkowitz and Levine tested the differentiating power of
the nine Miale-Harrower-Erickson scores on two samples of V. A. patients: 25 
neurotics and 25 schizophrenics, each group randomly selected from open and closed 
wards respectively. Only one of the nine categories commonly accepted as neurotic 
indicators yielded a statistically reliable differentiation of the two groups. Two 
other scoring categories, F+% and P, yielded reliable differences between the 
neurotics and psychotics. Since the actual distribution of these scores is not shown 
nor the per cent of overlap for the two groups reported, the clinical usefulness of 
these scores as diagnostic differentiators is uncertain. 

A comparable study was carried out by Reiman on samples of 50 neurotics
and 50 schizophrenics, all male World War II veterans. A total of 86 different 
Rorschach variables was tested by means of chi-square to determine the independ- 
ence of the distributions in the two samples. Variables meeting the 1% level of 
confidence were then tested in a replication using new samples of the same size and 
identical procedures. Only five elements of the 86 showed reliable differentiation 
at the 5% level in both studies. Reiman concluded that the “schizophrenic” signs 
he found were not sufficiently reliable to support individual diagnosis. 

Rubin and Lonstein attempted a cross-validation of diagnostic patterns for
schizophrenia suggested by Thiesen. They report the frequency of each of Thiesen’s 
five patterns in a sample of 42 male schizophrenics and compare their frequencies 
with Thiesen’s. While Thiesen reported one or more of his patterns occurring in 
48% of his sample, Rubin and Lonstein obtained only a 16% incidence.

The remaining three studies of the Rorschach which fall in Category #1 are 
studies of stress. Berger administered the Rorschach to two groups of tuberculous
patients, forty in each group, matched for age, sex, race, marital status, education, 
and diagnostic classification, with each group retested after an interval of six weeks. 
The experimental group was tested in the first day of admission to the TB hospital 
or sanitarium, and the control group had been hospitalized at least six months at the 
time of first testing. Berger assumed that the fact of initial hospitalization would 
present a real-life situation of greater stress than would characterize the situation 
of patients who had had six months in which to adjust and, further, that the experi- 
mental group would show greater reduction in stress from test to retest. Sixteen 
Rorschach variables commonly considered as anxiety indicators were tested for the 
reliability of the differences between the test-retest changes of the two groups. 
Berger interprets twelve of the fifteen variables to reveal statistically reliable differ- 
ences between the two groups. It is impossible to find support for these findings in 
the statistical data he reports. For thirteen of the fifteen variables tested, the 
standard error of the difference is larger than the reported difference between the 
mean change rate of the two groups. The actual t-values, which Berger does not 
report, range from .028-1.408, and only two of the t-tests exceed 1.00. It is con- 
sequently impossible to comprehend the conclusion that, “Other than for minor 
discrepancies, the results obtained were consistant with anxiety signs used in clinical 
practice.” 
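
The arithmetic behind this objection is simple: t is the ratio of a difference to its standard error, so whenever the standard error exceeds the difference, t falls below 1.00. The sketch below assumes a conventional 5% criterion and uses the normal-curve values 1.96 (two-tailed) and 1.645 (one-tailed) as lower bounds, since the critical values of Student's t for any finite degrees of freedom are at least that large. The function names are illustrative only.

```python
# Normal-curve critical values; Student's t requires at least these
# magnitudes at the 5% level, whatever the degrees of freedom.
TWO_TAILED_5PCT_FLOOR = 1.96
ONE_TAILED_5PCT_FLOOR = 1.645


def t_ratio(mean_difference, standard_error):
    """Student's t for a difference and its standard error."""
    return mean_difference / standard_error


def could_reach_5pct(t_value, one_tailed=False):
    """False if t cannot be significant at 5% for any degrees of freedom."""
    floor = ONE_TAILED_5PCT_FLOOR if one_tailed else TWO_TAILED_5PCT_FLOOR
    return abs(t_value) >= floor


# The largest and smallest t-values computed from Berger's reported data.
largest_t, smallest_t = 1.408, 0.028
largest_ok = could_reach_5pct(largest_t)                        # two-tailed
largest_ok_one_tailed = could_reach_5pct(largest_t, one_tailed=True)
```

Even the largest ratio, 1.408, falls short of both floors, so none of the differences can be called reliable at the 5% level on either a one-tailed or a two-tailed reading.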

Wishner used a sample of eleven male, anxiety neurotics and ten normals to
make an extensive study of physiological measures as related to Rorschach variables. 
His physiological variables, which were measured before and after administration 
of the Rorschach, included two muscle action potentials (MAP), an EEG reading 
from a left occipital monopolar lead, galvanic skin responses (GSR), respiration, and 
blood pressure under various experimental conditions. Correlations were computed 
between the physiological measures and thirteen Rorschach variables. The nature 
of the coefficients reported is not specified although they were based on Ns of 21 
and are probably Pearsonian. Of the 104 coefficients computed, only six met the 
5% level of confidence and only three the 1% level. None of the Rorschach scores 
differentiated between the two groups. The neurotics were characterized by higher 
heart and respiration rates, the normals by higher GSRs and MAPs from the eye- 
brows. Two groups of relationships were found between Rorschach scores and 
physiological measures: The sum of standard scores for A%, T/R, and F% cor- 
related with the sum of standard scores for respiration rate and number of masseter 
contractions. Similar sums of M, sum Y, and sum C correlated positively with the 
sums of GSR and frontal muscle potentials. Wishner interprets these two constella- 
tions as measuring respectively the degree of diffuse (D) versus focal (F) discharge 
of tension in his subjects and found a correlation of +.66 between D-F measures 
computed from the Rorschach and physiological measures respectively. On the 
basis of these findings, he suggests the hypothesis that “psychopathology is
characterized by a preponderance of diffuse, unfocused activity.” There is obvious need for
extensive cross-validity studies of Wishner’s experiment with larger samples. 
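
One way to read the remark that "only six" of the 104 coefficients met the 5% level is to ask how many would be expected to do so by chance alone. The sketch below makes two simplifying assumptions that are not stated in Wishner's report: that every true correlation is zero, and that the 104 tests are independent (they are not strictly so, being computed on the same 21 subjects). The function names are hypothetical.

```python
from math import comb


def expected_by_chance(n_tests, alpha):
    """Expected number of nominally significant results if every null
    hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha):
    """Binomial probability of k or more significant results among
    n_tests independent tests, each with chance probability alpha."""
    return sum(
        comb(n_tests, i) * alpha ** i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


n_coefficients = 104                                       # from the report
expected_5pct = expected_by_chance(n_coefficients, 0.05)   # about 5.2
expected_1pct = expected_by_chance(n_coefficients, 0.01)   # about 1.04
chance_of_six_or_more = prob_at_least(6, n_coefficients, 0.05)
```

Under these assumptions about five coefficients would reach the 5% level by chance, which is close to the six actually observed.
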

Finally, a replication of Williams’ pioneer experimental study of the validity of
Rorschach indices of intellectual control is reported by Carlson and Lazarus.
They followed the procedures of the earlier experiment exactly and used comparable
experimental and control groups of 25 and 15 subjects respectively. Whereas
Williams obtained a multiple R of +.824 between a measure of performance decrement
under stress and the Rorschach variables of Sum C/Total C plus F+%, the
comparable value obtained by Carlson and Lazarus was only +.39. The correlation
between Sum C/Total C and stress decrement obtained in the replication approached
statistical significance (-.37) but was opposite in sign to that found by Williams 
(+.35). The relationship of F+% and stress decrement was +.16 as compared 
with Williams’ value of -.61. The authors review their findings in the light of other 
comparable experimental studies of Rorschach indices of stress and conclude it is 
reasonable to reject Williams’ results as representative. At the completion of Will- 
iams’ procedure, Carlson and Lazarus administered Winne’s Neuroticism Scale, 
adapted from the MMPI. The only reliable correlation they obtained in all of their 
analysis was one of +.54 between scores on the Winne scale and “improvement under
stress.” Also, this scale, unlike any of the Rorschach variables, yielded a correlation
approaching significance with ratings of overt signs of stress made by the experi- 
menters. 

The remaining papers of Category #1 include five studies of the TAT or
adaptations of that instrument, five studies of the H-T-P technique, four studies of the
Bender-Gestalt, four studies of figure drawings, two studies of the Szondi, a study of 
a graphomotor projective technique, and a study of the 8-card redrawing test. 

Two of the TAT studies were concerned with the basic issue of the relationship 
between the extent of ambiguity of the pictorial material and the amount of
personality projection in themas. Weisskopf-Joelson and Lynn selected ten pictures
from Bellak’s Children’s Apperception Test for administration to fifty children, 9 
years to 9-11 years old. An experimental and control revision of these pictures was 
prepared; the experimental set consisted of very incomplete tracings of the contours 
of the standard pictures, and the control set presented complete contours of the 
drawings. The experimental set was administered first, individually, followed in 
two weeks by the control set. Applying Weisskopf’s previously developed
“transcendence index” as a measure of the amount of projection elicited in the protocols
under the two test conditions, the authors found a statistically reliable difference in 
favor of the less ambiguous materials, i.e., the children projected more personal 
material in their stories for the completely outlined figures. This finding agrees with 
that earlier reported by Weisskopf for a study of adults. 

Kenny and Bijou, on the basis of earlier rankings by 51 judges, assigned
fifteen cards of the Murray TAT series to three sets of five cards each, each set
representing one of three levels of ambiguity. The fifteen cards were then individually
administered to eighteen male college students and subsequent analysis made of the 
extent of personality data revealed in the stories produced to each of the cards. The 
cards in the medium ambiguity set were found to yield reliably greater amounts of 
personal projection than those of the least or most ambiguous sets. Further, when 
separation was made into cards from the first half and the second half of Murray’s 
series, the latter considered more ambiguous by Murray, pooled judgments showed 
no reliable difference between the two sets with respect to the extent of personality 
factors revealed in the themas. The authors interpret their findings to throw serious
question on the general hypothesis that with increasing stimulus ambiguity there is 
increasing personality projection through phantasy. 

A concise study of one aspect of Bender-Gestalt responses is reported by 
Peek who made an analysis of the frequency of occurrence of forty different
personal and social history items in two groups of male neuropsychiatric patients. One
group consisted of 75 patients who had drawn the diagonal projection of Figure 5 in 
the most frequent manner, i.e., away from the inverted half-circle of dots. The 
other group consisted of 75 patients who had drawn the diagonal projection toward 
the semi-circle. A total of 36 clinical variables was found to differentiate reliably 
between these two samples. The “deviant” group revealed a greater number of
complaints, a larger proportion of headaches, respiratory ailments, muscular pain, 
weight loss, and anorexia. Also, they were characterized more often as immature, 
passive, inadequate, and dependent and more frequently exhibited hostile-aggressive 
behavior than did the control group. In view of the current popularity of the Bender 
for personality study, there is urgent need for further objective validation of this 
type. 

In the area of figure drawing analysis, Whitmyre reports a study of the
relationship between the artistic excellence of drawings and the degree of personal
adjustment inferred from them. Fifty sets of human figure drawings, one of each 
sex, were obtained for a sample of male, World War II veteran psychiatric patients 
between the ages of 22-33 and with W-B IQs of at least 100. Drawings were also 
obtained from 50 veterans who were functioning adequately in daily life, had no 
history of neuro-psychiatric difficulties, and who were matched with the psychiatric 
group for age, race, and educational achievement. Two sets of fifty drawings each, 
with 25 “normal” and 25 “psychiatric” subjects represented in each set, were then
submitted to eight advanced art students for ranking as to artistic merit. They were 
also submitted to two six-man groups of Ph.D. clinicians with each group ranking 
one of the sets of drawings for artistic merit and the other for level of adjustment, 
and vice versa. Reliability coefficients for the various judgments, obtained by cor- 
relating halves of each group of judges, ranged from .884 to .943. Both the ratings 
of artistic merit made by the artists and those made by the psychologists showed 
high positive relationship (>.700) with the psychologists’ ratings of adjustment. 
When point-biserial correlation was used to obtain the extent of relationship of 
the artists’ ratings for artistic merit and the psychologists’ ratings of adjustment 
with the psychiatric patient versus non-patient criterion, the respective values were 
.095 and .237. Whitmyre concludes, “Human figure drawings executed by persons
of average or above-average intelligence seem to indicate art achievement but do 
not seem to show any consistent relationship to level of personal adjustment.” 

Because of the great range of subjects covered, it is difficult to summarize the 
researches included in Category #2, normative studies of personality. Included 
here were seven papers on schizophrenia, two papers on the concept of rigidity, 
two papers on delinquency, and two papers on factorial approaches to the
classification of children and out-patient problems. A brief article by Edwards and Harris
presents an interesting and currently rare type of study in experimental
psychopathology. Using a finger tremometer to measure tremor in three dimensions, a sample
of 129 state hospital schizophrenics was tested and retested after an interval of 33 
months. However, patients classified as clinically improved showed a reliable de- 
crease in finger tremor. The sex difference with respect to amount of finger tremor 
was reversed for the schizophrenics as compared with the normals who show greater 
tremor in males. 

Krasner reports an extensive analysis of psychological differences between
psychosomatic and non-psychosomatic veteran patients. All patients were native, 
white, male veterans between the ages of 20-40. Psychosomatic disturbance was 
represented by two samples: 30 patients with duodenal ulcer and 27 with ulcerative 
colitis. The non-psychosomatic group was composed of 44 patients with inguinal 
hernia or pilonidal cyst. The three groups were well matched with respect to age, 
education, and average income. Tests administered included the W-B (Form I), 
Guilford-Martin Inventory (STDCR, GAMIN, OAgCo), Thurstone Interest 
Schedule, and a questionnaire designed to fill gaps not covered by the other inventor- 
ies. Compared with the non-psychosomatic group on the Guilford factors, the 
psychosomatic patients reported themselves with reliably greater frequency as being 
shy, having greater disposition toward flightiness and instability, as socially passive, 
and lacking in confidence. With intelligence equated, no reliable differences ap- 
peared among the groups in vocational interest. Krasner concludes that the “differ- 
ences between the ulcer and the colitis group were not large enough to provide evi- 
dence that there were personality differences between these two groups.” 

In view of the growing social concern for the problems of our aged population,
it is disconcerting to find only a single study in the area of gerontology. Lebo
reports a survey of 383 subjects over age 60 in which a questionnaire was used to 
determine factors related to greater or less happiness as evaluated against the pre- 
60 years. The influences of finances, hobbies, friends, sex, age, and presence of spouse 
were studied. The happy and unhappy groups were particularly distinguished by 
the income variable, a reliably greater percentage of the unhappy group receiving 
less than $50 per month. Number of hobbies did not bear a significant relationship 
to happiness, but the number of close friends and out-of-state visitors was an im- 
portant variable. A significantly greater percentage of females were unhappy in 
their old age than were males. 

Delinquency represents another area of currently high social concern in which 
surprisingly few studies appear in this survey. Wattenberg and Quiroz report a
follow-up study of boys with police records. The files of the Youth Bureau of the 
Detroit Police Department were used to obtain names of all ten-year-old boys with 
a police record for 1948. The names of these boys were then sought in the files for 
1950. Eighty per cent of the original 1948 sample had gone for at least a full year 
without report of any further delinquency. The recidivists and non-recidivists were 
compared on 50 items of the Youth Bureau history sheet. Factors found predictive 
of repeated delinquency were having two or more brothers, living in an apartment or
rooming house, and having a reputation as a “bad boy.”

Included in Category #3, normative studies with projective techniques, are 
five papers each on the Rorschach, Rosenzweig P-F Study, and figure drawing; two 
papers on the TAT, and a study with the Three-Dimensional Apperception test. 
Fiske and Baughman report an extensive study of the relationship between the
total number of Rorschach responses and the frequency of the major scoring categor- 
ies. Two basic samples were used: 633 Rorschach protocols obtained from patients 
at a V. A. mental hygiene clinic and the 157 protocols used in Beck’s recent norma- 
tive study. Contingency coefficients were used to determine the extent of the re- 
lationship between R and the various other scores; these values ranged from —.26 
to +.63. The relationships between R and the various scoring categories were found 
to be frequently non-linear, to have complex and varying patterns from category to 
category, and to be fairly similar for the normal and outpatient samples. They 
conclude, in agreement with Cronbach, ‘‘that scores based on frequencies of responses 
in particular scoring categories are unsatisfactory psychological measures and that 
taking these scores as percentages of R is only a partially adequate solution to the 
problem.” 

A major normative study of children’s drawings of human figures is reported 
by Weider and Noller. Subjects were 210 boys and 228 girls, between the ages
of 7 and 12, in the third grade of a public school. The IQ range was 70 to 140. A
group testing procedure was used in obtaining the drawings. Data are reported on
the relationship between age, sex, and intelligence and sex of first figure drawn, sex 
of larger figure drawn, and placement of figures. 

The paper by Winfield and Sparer on the Rosenzweig P-F Study responses
in attempted suicide is of particular note because of the detailed descriptive informa- 
tion provided on the 26 subjects, all white, male veterans. Their mean age was 34 
years with a range of twenty years; they averaged ten years of education and had a 
mean estimated IQ of 104, with a range from 69 to 139. Eighteen were married and 
only four were single. Cutting instruments, usually razor blades, had been used in 
nearly 50% of the 34 attempts made by the 26 subjects. Next in order of frequency 
were sleeping pills and firearms, five uses each. It was hypothesized that patients 
with a history of attempted suicide would show a higher frequency of intropunitive- 
ness on the P-F Study than Rosenzweig’s norm group. This hypothesis was not con- 
firmed. Of the seven major scores yielded by the P-F Study, the suicide and norm 
groups were essentially alike on five. The suicidal group did show reliably less extra- 
punitiveness and greater impunitiveness than the norm group.

Brief mention must be made of a few studies in Category #4, normative studies
of intelligence. Heald and Stanley provide some much needed normative data
for performances on the Stick Test and Color-Form Sorting Test from the Goldstein- 
Scheerer battery. Their subjects were 72 boys and 66 girls ranging in age from 6-4 
to 11-3 years, drawn from the first three and first five grades respectively of two 
elementary schools. Scores on the two tests are reported in the form of distributions 
for each chronological age. It would appear that both of these tests have a ceiling 
at about C. A. 10 in normal children. These data should be helpful to clinicians who 
have found the tasks to be of general interest in studying deficit but have not known 
how to reasonably calibrate their findings. 

Two papers, by Norman and by Strange and Palmer, call attention to sex 
differences in performance on the W-B. Norman studied a sample of 85 male and
68 female normals, all with IQs above 120, in the age range 15-29. The mean Verbal
IQ of the males was reliably higher than that of the females, and a reliable difference
in the opposite direction obtained for the Performance IQ. Both subtest means and
intra-subtest item scores were tested for reliability of sex differences with several
significant findings resulting.

Strange and Palmer report an analysis of sex differences in the W-B records
of 145 male and 90 female psychiatric out-patients routinely tested. Of fourteen 
tests run (on the eleven subtests and three IQs), eleven showed reliable sex differ- 
ences. The need for attention to possible sex differences in studies involving Wechs- 
ler subtests is clearly indicated. 

The studies included in the first four categories of Table 1 comprise 53% of the 
total of 186 studies in this survey. Detailed comment on the remaining categories 
is prevented by limitations of space. Note must be made, however, of the striking 
paper by Hovey and Stauffacher. They report a research in which intuitive
clinical readings of MMPI profiles were tested against statistical manipulations of
the same MMPI data in predicting the assignment of certain trait adjectives to 47
student nurses by supervisors. A basic trait list based on a previous study was used
for collecting the criterion and for expressing both the “mechanical” and “clinical”
predictions. Using the more critical of two levels of criterion, the number of “hits”
to “misses” was 5.8:1 for the mechanical predictions but only 2.1:1 for the clinical
method. Of a total of twenty studies comparing actuarial with clinical methods,
Meehl finds this to be the first research in which the clinical approach appears
more efficient. The Hovey-Stauffacher study is not strictly a comparison of clinical 
versus statistical prediction in which profile analysis is used in both approaches. 

In the area of objective evaluation of psychotherapy (Category #6), attention 
is merited by the report of Barron of part of an extensive, objective analysis of
factors related to response to therapy. Particularly notable are the operationally 
sophisticated criteria for evaluating improvement or lack of improvement, the care- 
ful testing for reliability of the ratings of the criterion, and the intensive analysis of 
the test data. 

In light of the prominence of controversy concerning the specificity and power 
of psychiatric treatment, it seems a healthy sign that studies concerned with ob- 
jective evaluation of therapeutic outcomes continue to rank high (Category #6). 
It is disconcerting that Category #14, analysis of recorded interviews, is not repre- 
sented by a larger number of studies inasmuch as it is this category which contains 
studies of the psychotherapeutic process. In the 1949 survey, researches of this type 
ranked second in order of frequency, but the category has subsequently failed to 
appear higher than seventeenth place. While it is not to be hoped that the complex 
process of psychotherapy may be thoroughly grasped through study of electrical 
transcriptions, yet it seems patent that the part of the process which derives from 
the content of verbal exchange must be objectively scrutinized through the medium 
of such recording. A scientific basis for instruction in therapeutic technique demands 
exemplification of the type which can best be achieved through analysis of recorded 
interviews. It is unfortunate that relatively so little work of this kind is being done 
and that it tends not to be a widespread research interest.


DISORDERS OF NEURO-PSYCHIATRIC PATIENTS IN PERCEIVING
PICTURES


J. P. S. ROBERTSON* 
Netherne Hospital, Coulsdon, England 


INTRODUCTION 


Disturbances of visual perception are commonly investigated in simplified, 
schematic situations or by impoverishment technics such as tachistoscopic exposure. 
Recent work in the theory of perception, however, suggests that such studies should 
be amplified by others using more realistic situations seen in full scrutiny. The
perception of pictures is of interest here both as a closer approximation to perception
of the real environment and because of its intrinsic importance in civilized life.
The following inquiry was concerned with disorders in this activity. What consti- 
tutes a perceptual disorder is often left undefined but it may conveniently be re- 
garded as a way of perceiving which (a) does not correspond to the facts or is other- 
wise inefficient, and (b) is more frequent, to a statistically significant extent, amongst 
pathological than adequately adjusted persons. 


MATERIAL AND INSTRUCTIONS 


It was desirable that the pictures to be perceived should be fairly complex, 
varied in content, distinctively colored and unlikely to have been seen by any sub- 
ject on a previous occasion. Suitable material was found in some old issues of a 
foreign illustrated magazine, which were available. Twelve colored photographs 
were selected, depicting the following scenes: (a) a woman standing in an orchard 
and looking at some fruit blossoms; (b) some men playing billiards and drinking 
beer in a club-room; (c) some high buildings in a city beside a river; (d) a herd of 
cows in a field with farm buildings behind; (e) a hydro-electric dam and water- 
channels; (f) a woman sketching beside a palm-tree in the tropics; (g) some children 
in bed in a summer-camp dormitory; (h) a view of sun-flowers and a river, with a 
village in the distance; (i) some men in a gymnasium watching another in the act of 
vaulting a cross-bar; (j) some men prospecting for oil in a desert with trucks behind 
and a heavy cloud of smoke at the side; (k) a man and woman in white coats watch-
ing a gauge in a laboratory; (l) a display of ornamental glassware set out on tables
with fine textiles hanging behind. 

Each photograph included many more details. The pictures were cut out and 
gummed in the order named against the right-hand pages of an album. There was 
some disparity in area amongst them (range 231-644 sq.cm., mean 378.92, S.D. 
121.82) but subsequent examination indicated that this had no statistically sig-
nificant effect. The initial instructions were: “I’m going to show you some photo-
graphs and I want you to tell me everything you can see in them. It’s not a memory
test, so tell me as you’re looking. You’ll have one minute for each picture, to tell
me all you can see in it.” The task, therefore, was strongly inclined towards enumera-
tion or description and away from narration, interpretation or esthetic evaluation.
The subjects were permitted to hold the album at any angle convenient to them. 
They were required to keep looking at each picture to the end of the time-limit and 
not to turn the page until the examiner gave a signal. Their entire verbal production 
for each picture was recorded in longhand except where its speed and amount made 
shorthand obligatory. Obscurities in statements were clarified after each picture. 


*The author’s thanks are due to Dr. R. K. Freudenberg, physician superintendent, Netherne 
Hospital, for facilities to conduct this research. A preliminary communication of the findings was 
made at a meeting of the Surrey Clinical Psychologists’ Group in November, 1953, and the author 
is grateful for criticism offered then.


SUBJECTS


Eight categories were sampled from the in-patients of Netherne Hospital: (a) 
bright convalescents; (b) dull convalescents; (c) uncomplicated hypophrenics; (d) 
chronic patients, level 1; (e) same, level 2; (f) same, level 3; (g) same, level 4; (h) 
organics. 

The convalescents were defined as recent patients who had satisfactorily com- 
pleted their treatment and were in convalescent wards awaiting an early discharge 
to the outside community; they were called bright if their Wechsler Vocabulary 
score were 32 or more, dull if it were 18-25 inclusive. The uncomplicated hypophrenics
were defined as certified mental defectives in whom no evidence of superposed psycho- 
sis had ever been adduced and whose Wechsler Vocabulary score was 15 or less. 
The chronics were defined as patients who had been in hospital at least three years 
and whose return to the outside community was very unlikely. They were con- 
sidered to be at level 1 if they were well-behaved, lived in unlocked wards on parole, 
worked satisfactorily somewhere in the hospital and had never, in psychological 
tests or neuro-psychiatric examinations, given evidence of bizarre thinking apart 
from the expression of rational delusions. They were considered to be at level 2 if all 
this were the case except that tests or examinations had elicited evidence of bizarre 
thinking. They were allocated to level 3 if they were badly-behaved and confined to 
locked wards but worked under supervision, maintained adequate personal clean- 
liness and evidenced bizarre thinking in tests or under special examination, but only 
then. They were allocated to level 4 if they were confined to locked wards, did no 
work, neglected personal cleanliness and showed evidence of bizarre thinking in 
casual contacts. The organics were defined as patients in whom structural damage 
of the cerebral cortex had been neurologically demonstrated, such patients being 
excluded from the other categories. 
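
The grouping rules just described can be restated compactly. The following Python sketch is offered only as an illustration: the field names (`locked_ward`, `works`, and so on) are hypothetical and do not come from the hospital records, and "well-behaved, in an unlocked ward on parole" is collapsed into a single flag as a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


def convalescent_group(vocabulary_score: int) -> Optional[str]:
    """Bright/dull split for convalescents by Wechsler Vocabulary score."""
    if vocabulary_score >= 32:
        return "bright"
    if 18 <= vocabulary_score <= 25:
        return "dull"
    return None  # neither band applies


@dataclass
class ChronicPatient:
    # All field names are hypothetical, chosen for this illustration only.
    locked_ward: bool                # False stands for "unlocked ward, on parole, well-behaved"
    works: bool                      # does some work in the hospital
    clean: bool                      # maintains adequate personal cleanliness
    bizarre_in_tests: bool           # bizarre thinking elicited in tests or examinations
    bizarre_in_casual_contact: bool  # bizarre thinking evident in casual contacts


def chronic_level(p: ChronicPatient) -> Optional[int]:
    """Return the chronic level 1-4, or None if no definition fits."""
    if not p.locked_ward and p.works and not p.bizarre_in_casual_contact:
        # Levels 1 and 2 differ only in whether tests elicited bizarre thinking.
        return 2 if p.bizarre_in_tests else 1
    if (p.locked_ward and p.works and p.clean
            and p.bizarre_in_tests and not p.bizarre_in_casual_contact):
        return 3
    if (p.locked_ward and not p.works and not p.clean
            and p.bizarre_in_casual_contact):
        return 4
    return None
```

A patient who fits none of the four definitions is returned as `None` rather than forced into a level, which mirrors the fact that the definitions are not exhaustive.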


Lists of patients satisfying these criteria were available from previous researches
and random samples of four males and four females in each category were secured.
The convalescents could be regarded, from the evidence of other psychological in- 
vestigations, as closely approximating a normal control group, and the organics 
could be considered a control group of another sort. The four levels of chronics 
constituted a gradient of impairment in social behavior and of disorganization in 
thinking. Similarly the bright and dull convalescents and uncomplicated hypophren- 
ics were at three levels in a gradient of general intellectual ability. 
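
The sampling scheme (four males and four females drawn at random from each of eight category lists) might be sketched as follows. This is an assumption-laden illustration: the category labels, the patient identifiers, the list sizes and the fixed seed are all hypothetical, and nothing is known about how the random draw was actually made.

```python
import random

# Hypothetical labels for the eight categories named in the text.
CATEGORIES = [
    "bright convalescents", "dull convalescents", "uncomplicated hypophrenics",
    "chronics level 1", "chronics level 2", "chronics level 3",
    "chronics level 4", "organics",
]


def draw_sample(lists_by_category, per_sex=4, seed=0):
    """Draw per_sex males and per_sex females at random from each category list.

    lists_by_category maps a category name to a list of (patient_id, sex) pairs.
    """
    rng = random.Random(seed)  # fixed seed only so the illustration is repeatable
    sample = {}
    for category, patients in lists_by_category.items():
        males = [pid for pid, sex in patients if sex == "M"]
        females = [pid for pid, sex in patients if sex == "F"]
        sample[category] = rng.sample(males, per_sex) + rng.sample(females, per_sex)
    return sample


# Hypothetical lists of ten males and ten females per category.
lists = {
    c: [(f"{c} M{i}", "M") for i in range(10)] + [(f"{c} F{i}", "F") for i in range(10)]
    for c in CATEGORIES
}
sample = draw_sample(lists)
print(sum(len(v) for v in sample.values()))  # 64 subjects in all
```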


VARIABLES 


The records were carefully scrutinized and the following variables defined. 
Their frequencies were independently counted by two persons. Particulars of their 
distributions in the total sample are set out in table 1. 


(a) Total number of clear percepts. What is to be taken as a unit of visual perception is neces-
sarily a conventional matter. Here each word which isolated from a picture an aspect that could
be visually perceived was treated as referring to a distinct and definite percept. More precisely, 
each substantive, adjective, verb and adverb whose referent could be visually demonstrated in a 
picture was counted as a separate percept. For example, ‘a yellow straw hat with a red ribbon’ 
would be regarded as comprising five separate percepts, ‘yellow’, ‘straw’, ‘hat’, ‘red’ and ‘ribbon’, 
without consideration of whether these were isolated simultaneously or successively. Where S 
isolated the same aspect of a picture more than once, only the first reference was counted. 


(b) General characterizations. Summarizing descriptions of an entire picture, e.g. ‘“‘a scene 
in the tropics.” 


(c) Locating references. Phrases or words indicating the spatial position of an object in a
picture, e.g. “in the background”, “to the left”, etc.


(d) Counting references. Instances where S gave the number of occurrences in the picture of
objects of a particular class, e.g. “seventeen cows.” 


(e) Wrong counting references. Counting references that named a precise but incorrect num-
ber. If S prefaced the number by “about”, “maybe”, etc. it was not considered as wrong counting.


(f) Color references. Adjectives or, rarely, substantives of color describing part of the picture.


(g) Unrelated color references. Instances of naming a color in a picture without naming the
object to which it referred.


(h) Esthetic references. Descriptions of a picture or an object in it as “beautiful”, “nice”,
“ugly”, “nasty” and the like.


(i) Inferences. Statements inferred from a picture but not actually represented there, e.g. 
“‘they’ll be going to milk those cows soon.” 


(j) Minute details. Instances where S referred to an unemphasized portion of a picture less
than 25 sq. mm. in area. 


(k) Absences of central object(s). Instances where S did not refer to the object(s) given major 
emphasis by the composition of the picture. 

(l) Delayed references to central object(s). Instances where S named three or more percepts
before naming a percept relevant to the central object(s) of a picture. 


(m) Negative references. Statements that some object was absent from a picture, e.g. ‘““There 
ought to be a dog but I can’t see him.” 


(n) Wrong percepts. Instances of identifying a depicted object as something quite other than
it actually was, e.g. calling cows ‘Alsatian dogs.’


(o) Introduced objects. Instances of naming an object to which nothing whatever in the pic-
ture corresponded. These were distinguished during administration from the preceding by the
fact that S could point to a referent for wrong percepts but not for introduced objects. 

(p) Perceptual difficulties. Instances where S could point to a depicted object but could 
neither name it nor describe its appearance except in vague terms, e.g. “something there, can’t 
make out what, white, a line, face there” for clearly delineated scientist in white coat. 

(q) Interpretation difficulties. Instances where S could point to an object and describe its 
appearance precisely but could not give its name or function, e.g. “long black rectangular metallic
object, don’t know what for”, for gasoline can in shadow.

(r) Verbalization difficulties. Instances where S could point to an object and also describe 
its function and appearance precisely but could not recall its name. 

(s) Size, distance and perspective difficulties. Instances where S remarked on the small size 
of a distant object or the large size of a near one, together with instances where a vertical arrange- 
ment of objects was described as one backward along the ground or vice versa. 

(t) Unfamiliarity references. Statements that objects looked strange in some way. Owing to
cultural differences some such statements were strongly defensible and these were not counted,
so that this variable was to some extent subjective.


(u) Familiar recognitions. Statements that S had either actually seen what was depicted or 
had seen the photograph before. The truth of all such statements could be rejected. 

(v) Self-references. Statements that a picture represented situations or actions of S himself 
or his close associates. 

(w) Over-real percepts. Statements, accompanied by evident excitement, that what was de- 
picted was actually happening or present at that moment. 
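
The counting convention of variable (a) can be illustrated with a short Python sketch. It assumes that each recorded word has already been tagged by hand with its part of speech and with whether its referent can be shown in the picture; the tuple layout and the tag names are hypothetical. Treating "the same aspect" as "the same word" is a simplification made for this sketch and is not the scorers' actual rule.

```python
# Parts of speech that count as percepts under definition (a).
CONTENT_TAGS = {"substantive", "adjective", "verb", "adverb"}


def count_clear_percepts(tagged_words):
    """Count distinct content words whose referent is visible in the picture.

    tagged_words is a sequence of (word, part_of_speech, referent_visible) tuples.
    Repeated references are counted once (first occurrence only).
    """
    seen = set()
    for word, tag, referent_visible in tagged_words:
        if tag in CONTENT_TAGS and referent_visible:
            seen.add(word.lower())
    return len(seen)


# The example from the text: 'a yellow straw hat with a red ribbon'.
phrase = [
    ("a", "article", True),
    ("yellow", "adjective", True),
    ("straw", "substantive", True),
    ("hat", "substantive", True),
    ("with", "preposition", True),
    ("a", "article", True),
    ("red", "adjective", True),
    ("ribbon", "substantive", True),
]
print(count_clear_percepts(phrase))  # 5
```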


TABLE 1. DISTRIBUTIONS OF VARIABLES IN TOTAL SAMPLE

Variable                                   Range      Mean

Total no. of clear percepts                4 - 296    69.
General characterizations                  0-
Locating references                        0-
Counting references                        0-
Wrong counting references                  0-
Color references
Unrelated color references
Esthetic references
Inferences
Minute details
Absences of central object(s)
Delayed references to central object(s)
Negative references
Wrong percepts
Introduced objects
Perceptual difficulties
Interpretation difficulties
Verbalization difficulties
Size-distance-perspective difficulties
Unfamiliarity references
Familiar recognitions
Self-references
Over-real percepts


RESULTS


The statistical significance of differences in each variable between the various 
categories was tested by the Mann-Whitney U technic. Differences significant
at the 1 percent and 5 percent levels are shown in table 2. The categories, in the
order already listed, are denoted by the abbreviations B, D, H, 1, 2, 3, 4 and O
respectively. The two categories in a given significant difference are joined by a 
stroke. The category coming first displayed the variable to the greater extent. 
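
For readers unfamiliar with the technic, a minimal Python sketch of the Mann-Whitney U statistic for two categories of eight subjects each is given here. The scores are invented for illustration and are not data from this study. The exact permutation p-value is a modern substitute shown for convenience; it is an assumption that the original comparisons relied on published tables of U instead.

```python
from itertools import combinations


def midranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y); U_x counts pairs in which x exceeds y (ties count one half)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u_x = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    return u_x, n1 * n2 - u_x


def exact_two_sided_p(x, y):
    """Exact two-sided p-value by enumerating every split of the pooled ranks."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    centre = n1 * n2 / 2
    observed = abs(sum(ranks[:n1]) - n1 * (n1 + 1) / 2 - centre)
    extreme = total = 0
    for idx in combinations(range(n1 + n2), n1):
        u = sum(ranks[i] for i in idx) - n1 * (n1 + 1) / 2
        total += 1
        if abs(u - centre) >= observed - 1e-9:
            extreme += 1
    return extreme / total


# Invented percept counts for two categories of eight subjects each.
bright = [120, 95, 150, 88, 132, 101, 140, 97]
level4 = [20, 35, 12, 41, 9, 28, 50, 33]
print(mann_whitney_u(bright, level4))  # (64.0, 0.0): every bright score exceeds every level-4 score
```

With eight subjects per category there are only 12,870 ways of splitting the sixteen ranks, so full enumeration is cheap.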


TABLE 2. MANN-WHITNEY U TESTS

Variable Significant Differences
Total percepts (1%) rit a B/4, B/0; D/H, D/3, D/4, D/0; H/4; 1/4, 1/0; 


(5%) B/2; D/2; 1/H, 1/3; 0/4. 
General characterizations (1%) B/4, B/O; 1/H, 1/4, 1/0. 
(5%) B/H, B/3; D/A, y/o; 1/3. 
) Bit B/0; D/A, 
) B/H, B/1, B/2, Bis: D/H, D/2, “y 1/4, 1/0; 2/4; 3/4. 
) B/H, Bias «yy D/1, D/3, D/A, D/. 
) B/I, ; D/2 
)E 
) 
) 





Locating references 


Color references /H, i. D/H, D/4; 1/H, 1/4; 2/H, 2/4. 
iD oh D/0; 1/0; 2/0; 3/H, 3/4; O/H. 
3, 1/0 
/D; {74s 2/D, 2/H; 0/D, 0/H. 
/D, B/H, B/4, B/0; 1/H, 1/4, 1/0; 3/H. 
/3; 3/4, 3/0. 
/B, 3/D, 3/H; 4/B, 4/D, 4/H, 4/1, 4/2, 4/0. 
/B, 2/D; 3/1; 4/3; 0/B, 0/D. 
/B; 0/B, ’0/D, 0/H. 
/B; 2/B, 2/D; 3/B, 3/D; 4/D, 4/H. 
/B, 4/D, 4/H, 4/1, 4/2, 4/3. 
0; 0/B, 0/D, 0/H, 0/1, 0/2, 0/3. 
> B/4; 0/2, 0/4. 
/B, 3/ /D, 3/H.

In regard to the effect of general intellectual ability these variables can be
dichotomized into two sets, six displayed significantly more often by bright con- 
valescents than by uncomplicated hypophrenics and six in which these categories 
were not distinguished. The dull convalescents were indistinguishable from the 
bright convalescents in five variables of each set. 

In the gradient of disorganized thinking the chronics at level 1 were indisting- 
uishable in any variable from those at level 2, showed three significant differences 
from those at level 3 and eight from those at level 4; those at level 2 were indis- 
tinguishable from those at level 3 and showed five significant differences from those 
at level 4; those at level 3 also showed five significant differences from those at 
level 4. If the chronic levels are compared with the bright convalescents in the 
variables where the latter differed from the hypophrenics, level 1 was significantly 
lower in two variables, level 2 in two, level 3 in five and level 4 in six variables. 
Compared with the dull convalescents in the same regard, level 1 was lower in one 
variable, level 2 in three, level 3 in three and level 4 in five variables. Compared with 
the hypophrenics, level 1 was significantly higher in four, level 2 in one and level 3 
in two variables, but level 4 was significantly lower in one. In the set of variables 
where bright convalescents and hypophrenics were indistinguishable, the chronics 
at level 1 displayed significantly more of one variable than the bright convalescents, 
level 2 significantly more of two, levels 3 and 4 significantly more of three. Com- 
pared with the hypophrenics, levels 1 and 2 showed more of one variable each, level 
3 of two and level 4 of three variables.

The organic patients corresponded in most relationships to a position somewhere 
between the chronics of levels 3 and 4. In the second set of variables, Esthetic 
References and Verbalization Difficulties showed individual patterns sharply
demarcating them from the other four. Esthetic References were especially low in
the Dull Convalescents and Verbalization Difficulties especially high in the Bright 
Convalescents and Organics. The latter finding was probably to be explained by 
there being two sorts of verbalization difficulties, those arising from selectiveness 
with regard to the most appropriate word and those arising from amnestic aphasia 
for common objects. 


It is concluded from the foregoing that: 


(a) Low occurrence of any of the six variables Total Percepts, General Char- 
acterizations, Locating References, Counting References, Color References and 
Inferences corresponds to a low level of general intellectual efficiency, due to either 
innate defect or impairment and such low occurrences can be taken as perceptual 
disorders according to the definition of this article. 


(b) Occurrence of the variables Absences of Central Object(s), Wrong Per- 
cepts, Perceptual Difficulties and Familiar Recognitions corresponds not to low 
intellectual efficiency but to increasing severity of thought disturbance or bizarre 
thinking and these variables are perceptual disorders, understanding ‘perceptual’ 
in the narrowest sense. 

(c) Low occurrence of Esthetic References and occurrence of Verbalization 
Difficulties are distinct phenomena from the two preceding sets. 


In regard to the variables not showing significant differences, if the sample
size were multiplied threefold and the relative occurrence remained the same,
occurrence of Wrong Counting and Size-distance-perspective Difficulties would
belong with the first set above; occurrence of Introduced Objects, Self-references
and Unfamiliarity References would belong with the second set. Minute Details,
Negative References and Interpretation Difficulties were dispersed evenly over
the categories. Delayed Central Object(s) References, Unrelated Color References
and Over-real Percepts were too rare to suggest any conclusions.

It was not expected that there would be any sex differences in this domain. 
Actually, when tested by the Mann-Whitney U technic, General Characterizations 
were significantly more common in males (1% level) and Absences of Central Ob- 
ject(s) and Wrong Percepts significantly more common in females (5% and 1% levels 
respectively). As there was no evidence of a significant interaction between sex and 
category this matter was ignored. 


DISCUSSION


The main conclusion is that there are two chief kinds of disorders in perceiving 
pictures, those dependent on general intellectual impairment or inefficiency and
those not so dependent but related rather to bizarre thinking. The distinction be- 
tween the two rests largely on the performance of the uncomplicated hypophrenics. 
It corresponds to the fact that such persons, within the limitations imposed by their 
simple thinking and outlook, are shrewd and realistic in their response to their im- 
mediate environment. The second set of variables really, no doubt, involves numer- 
ous factors differing for each variable and perhaps including Bender’s extinction 
phenomenon and impairment in Thurstone’s perceptual factor A.

Certain alternative explanations of the findings should be considered: (a) de- 
fects in visual acuity; (b) inattention and poor motivation; (c) differences in verbal 
fluency; or (d) differences in original intellectual ability amongst the chronic and 
organic categories. 

Markedly defective visual acuity undoubtedly produces perceptual disorders 
identical with several of those listed here. Great pains were taken, however, by 
checking with the medical staff and the patients themselves, to exclude from the 
investigation any person with uncorrected sensory defects of vision. If some such 
cases were overlooked, they are hardly likely to have been frequent enough to pro- 
duce the results obtained.

Observation suggested that inattention was not a frequent behavior in this
situation and, when it occurred, the subject’s scrutiny was at once re-directed to the 
picture. The truth about motivation is difficult to ascertain but the most eagerly 
cooperative subjects were the hypophrenics, chronics and organics, those most
averse to the investigation the bright and dull convalescents. The patients in the
former categories were nearly all old acquaintances of the examiner and seemed 
pleased to carry out the task as a relief from monotony and a means of expressing 
their individuality. The latter categories comprised patients virtually unacquainted 
with the examiner and resentful of the task because they could not see any advantage 
in it for them. The effect of motivation, therefore, would seem likely to act in the 
opposite direction to the findings, not to be responsible for them. 

Differences in verbal fluency are clearly more relevant to some of the variables 
than others. The verbal fluency of all subjects was tested by asking them to name 
as many of the following as they could in one minute for each: (a) boys’ names; (b) 
girls’ names; (c) animals; (d) towns or villages; (e) words beginning with S; (f) words
beginning with D. The total score, omitting repetitions, was correlated with each 
of the investigated variables. In order to eliminate the effect of general ability and 
verbal comprehension, partial correlations allowing for the regression on Wechsler 
Vocabulary score were calculated. With general ability and verbal comprehension 
thus partialed out, only one investigated variable correlated to a statistically sig- 
nificant extent with the verbal fluency score, viz.: Inferences. Therefore it hardly 
seems that verbal fluency can have made an important contribution to the findings. 
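
The partialing step can be made concrete with the standard first-order partial correlation formula. In the Python sketch below, x stands for an investigated variable, y for the verbal fluency score and z for the Wechsler Vocabulary score. The function name and the numerical values are illustrative assumptions, not figures from the study, and it is an assumption that the partial correlations were computed in exactly this way.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented correlations: the raw x-y correlation of .50 shrinks once the
# shared dependence on vocabulary (z) is removed.
print(partial_correlation(0.5, 0.5, 0.5))
```

If neither x nor y correlates with z, the partial correlation equals the raw correlation, which serves as a quick check on the formula.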

The fourth explanation implies that there was a gradient of original intellectual 
ability down the four chronic levels and the organics. There were, however, no 
statistically significant differences in Wechsler Vocabulary scores amongst these, 
except for the chronics at level 4 who were significantly lower than the others, 
owing to the eruption of thinking disturbances into their answers. The other 
categories occupied a mean position more or less midway between that of the bright 
and dull convalescents. Data on the education and former occupations of the 
chronics at level 4 suggested that in original ability they too probably fell at this 
mean position. 

It is evident that many of the phenomena shown in perceiving pictures are 
closely analogous to those shown in perceiving inkblot stimuli. It is hoped to 
pursue this matter in a later article. 


SUMMARY 


Samples from eight categories of neuro-psychiatric patients were shown twelve 
colored magazine photographs and asked to describe all that they saw in each. 
Twenty-three variables in their descriptions were defined and counted and the 
categories were statistically compared in regard to each variable. It was concluded 
that there are two chief classes of perceptual disorders in this situation, those de- 
pendent on general intellectual impairment or inefficiency and those not so de- 
pendent but related to bizarre thinking.


PROBLEM


Knowledge of intellectual impairment, i.e., the discrepancy between present and 
premorbid intellectual functioning, has long been considered of significance in the 
study of the psychiatric patient. Very early, the Stanford-Binet was used to yield 
information relative to the impairment process through the study of scatter. More 
recently, the Wechsler-Bellevue has been used extensively for this purpose because its 
construction lends itself more readily to the analysis of scatter. In order to ob-
jectify and quantify scatter as an index of impairment, Wechsler has devised a
ratio between certain of the subtests in his scale which are resistive to the impair-
ment process and those tests which are readily affected by such a process. While
studies of this ratio have questioned its validity, psychologists in clinical practice
have continued to utilize the rationale underlying the use of this ratio for evaluating
impairment, i.e., psychological functions differ in their resistance to impairment and
the functions which are resistive to impairment are indices of premorbid function-
ing. In the clinical situation, the data from the Wechsler-Bellevue are usually in-
tegrated with educational and vocational attainments and other facts from the case
history in order to estimate premorbid intellectual functioning. 

This method for evaluating premorbid intellectual functioning, however, may 
not be applicable to the severely impaired patient for the following reasons: (1) 
Where the mental illness results in marked reduction of functioning, the patient may 
show a pervasive impairment affecting all psychological functions with the result that 
scatter is minimized. (2) Where the illness is manifested before maturity, the patient 
may not achieve the attainments of which he is capable and which would ordinarily 
be used as a basis for estimating premorbid intelligence. 

This study is concerned with answering the following questions relative to 
psychologists’ estimates of premorbid intelligence in severely impaired patients: 
(1) Do psychologists agree in their estimates of premorbid intelligence based on the 
Wechsler-Bellevue? (2) Do psychologists agree in their estimates of premorbid in- 
telligence based on case histories? (3) Do psychologists’ estimates of premorbid 
intelligence based on the Wechsler-Bellevue agree with their estimates based on the 
case history? (4) Do psychologists’ estimates of premorbid intelligence based on 
the Wechsler-Bellevue alone and on the case history alone agree with the judgments 
of the psychologists who had actually tested the patients? (5) Do psychologists agree 
in their reasons for estimating premorbid intelligence based on the Wechsler-Belle- 
vue and the case history when each is used separately? 


SUBJECTS 


The subjects of this study were ten patients whose Wechsler-Bellevue protocols 
and case histories were available in the psychology department files. The
Wechsler-Bellevue total IQ’s of these patients ranged from 54 to 74 with a mean of 69. Nine
of the cases had been judged by the examining psychologists, who were not the judges
in this study, as being of average premorbid intelligence and one as of bright normal
premorbid intelligence. The patients ranged in age from 19 to 38 with a mean age of 28. One patient
was diagnosed psychopathic personality, one manic-depressive and eight schizo- 
phrenic (three catatonic, three hebephrenic and two paranoid). 


The authors wish to express their appreciation to Sara Arnaud and David Shapiro for assisting 
in the analysis of the data. 
PROCEDURE AND RESULTS 

The raters used in this study were seven psychologists at the Connecticut State 
Hospital. Three of them were staff psychologists and four were internes who had 
completed two years of graduate training in clinical psychology and who had at least 
six months of experience in diagnostic testing. 

Each Wechsler-Bellevue protocol included the complete responses of the patient 
as well as all scores achieved. The case history of each patient consisted of his 
personal and family histories, his educational and work adjustments, data on his 
present illness and other pertinent information. The Wechsler-Bellevue protocols 
and case histories were not identified in any way so that the patients were not known 
to the judges. 


1. Reliability of estimates based on Wechsler-Bellevue. 


The judges were asked to estimate the premorbid level of intellectual functioning 
of each patient in terms of an IQ score. They were also asked to state their reasons 
for each estimate and to rank them in order of importance. The estimates of each 
judge were compared to those of the other judges in the following ways: First, rank- 
difference correlations between each judge and every other judge were calculated. 
Second, the mean estimate of each judge was compared to the mean estimate of 
every other judge, utilizing the t-test of differences between means. 
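
The two comparisons just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the estimates for the two judges below are hypothetical, not data from the study, and the t-test is taken to be the paired form, which is consistent with the df = 9 reported for ten patients but is not stated explicitly in the text. The function names are likewise illustrative.

```python
from statistics import mean


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_difference_correlation(x, y):
    """Spearman rank-difference correlation: Pearson r computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def paired_t(x, y):
    """t for the mean of the paired differences x - y (df = n - 1)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    md = mean(d)
    sd = (sum((v - md) ** 2 for v in d) / (n - 1)) ** 0.5
    return md / (sd / n ** 0.5)


# Hypothetical premorbid IQ estimates by two judges for the same ten patients.
judge_a = [95, 100, 90, 105, 85, 110, 92, 98, 88, 102]
judge_b = [90, 98, 85, 100, 88, 105, 91, 93, 80, 99]

rho = rank_difference_correlation(judge_a, judge_b)
t = paired_t(judge_a, judge_b)
print(f"rho = {rho:.2f}, t = {t:.2f} (df = {len(judge_a) - 1})")
```

The correlation answers whether two judges order the patients alike; the t-test answers whether one judge's estimates run higher or lower than the other's on the average. The two can disagree, which is why both were computed.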


TABLE 1. RANK-DIFFERENCE CORRELATIONS BETWEEN JUDGES’ ESTIMATES
OF PREMORBID INTELLIGENCE BASED ON THE WECHSLER-BELLEVUE*

Judge        1        2        3        4        5        6

  2        .84
  3        .68      .87
  4        .41      .15      .42
  5        .73      .64      .53      .30
  6        .64      .82      .74      .21      .70
  7        .67      .84      .87      .38      .64      .80

*Correlations of .77 and .63 are significant at the one and five percent levels
of confidence respectively (df = 8).











In Table 1 are presented the intercorrelations of the judges’ ratings. Fourteen 
of the 21 correlations are significant at least at the five per cent level. Six of the 
seven insignificant correlations were contributed by one psychologist (judge 4) who 
differed considerably from all the other judges. Table 2 lists the mean estimates of 
each judge and the t-values of the differences between the means of each judge when 
compared to every other judge. Ten of the 21 mean differences are significant at 
least at the five per cent level of confidence. 
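
The tally reported for Table 1 can be checked mechanically. The sketch below enters the lower triangle of Table 1 as read from the source, keyed by hypothetical (row judge, column judge) pairs, and counts the correlations that reach the five per cent critical value of .63 given in the table footnote. The variable names are illustrative only.

```python
# Five per cent critical value for df = 8, taken from the footnote to Table 1.
CRITICAL_R_05 = 0.63

# Lower triangle of Table 1; keys are (row judge, column judge).
TABLE_1 = {
    (2, 1): 0.84,
    (3, 1): 0.68, (3, 2): 0.87,
    (4, 1): 0.41, (4, 2): 0.15, (4, 3): 0.42,
    (5, 1): 0.73, (5, 2): 0.64, (5, 3): 0.53, (5, 4): 0.30,
    (6, 1): 0.64, (6, 2): 0.82, (6, 3): 0.74, (6, 4): 0.21, (6, 5): 0.70,
    (7, 1): 0.67, (7, 2): 0.84, (7, 3): 0.87, (7, 4): 0.38, (7, 5): 0.64,
    (7, 6): 0.80,
}

significant = [pair for pair, r in TABLE_1.items() if r >= CRITICAL_R_05]
not_significant = [pair for pair in TABLE_1 if pair not in significant]
involving_judge_4 = [pair for pair in not_significant if 4 in pair]

print(len(TABLE_1), len(significant), len(not_significant), len(involving_judge_4))
# prints: 21 14 7 6
```

The counts agree with the text: fourteen of the 21 correlations reach the five per cent level, and six of the seven that do not involve judge 4.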


TABLE 2. t VALUES OF DIFFERENCES BETWEEN MEANS OF JUDGES’ ESTIMATES OF PREMORBID
INTELLIGENCE BASED ON THE WECHSLER-BELLEVUE

                                      t values*
Judge             1           2           3           4           5           6        Mean

  1                                                                                    94.7
  2            1.12                                                                    91.5
  3            3.42        3.00                                                        84.7
  4             .53        1.01        2.88                                            97.0
  5            1.69         .47        1.72 [illegible]                                89.8
  6            4.51        2.77         .32        2.80        2.46                    83.7
  7            3.79        2.65         .47        2.95        1.69         .97        85.6

*t values of 3.25 and 2.26 are significant at the one and five percent levels of confidence
respectively (df = 9).
2. Reliability of estimates based on the case history. 

The procedure described above for the Wechsler-Bellevue was also used in this 
analysis. Table 3 presents the rank-difference correlations between judges’ ratings 
of premorbid intellectual functioning based on the case histories. Fifteen of the 21 
correlations are significant at least at the five per cent level. All of the statistically 
insignificant correlations are contributed by the same judge whose correlations based 
on the Wechsler-Bellevue were also insignificant (judge 4). 


TABLE 3. RANK-DIFFERENCE CORRELATIONS BETWEEN JUDGES’ ESTIMATES
OF PREMORBID INTELLIGENCE BASED ON THE CASE HISTORY*

Judge             1           2           3           4           5           6

  2             .89
  3     [illegible] [illegible]
  4             .54 [illegible]         .31
  5             .94 [illegible]         .65         .47
  6             .83 [illegible]         .63         .30         .79
  7             .96         .90         .64         .57         .93         .73

*Correlations of .77 and .63 are significant at the one and five percent levels of
confidence respectively (df = 8).











When the mean of each judge’s ratings of all ten cases is compared to the mean 
ratings of each of the other judges, four of the 21 mean differences are statistically 
significant at least at the five per cent level (Table 4). Four other differences ap- 
proach significance at the ten per cent level. 


TABLE 4. t VALUES OF DIFFERENCES BETWEEN MEANS OF JUDGES’ ESTIMATES OF PREMORBID
INTELLIGENCE BASED ON THE CASE HISTORY

                                      t values*
Judge             1           2           3           4           5           6        Mean

  1                                                                                   101.3
  2            4.90                                                                   108.8
  3            1.34         .66                                                       106.5
  4            1.17        1.40         .31                                           104.9
  5            1.49        1.72         .50 [illegible]                               104.3
  6            1.36        3.62        2.04 [illegible]        2.04                    97.5
  7            3.73        2.09         .28 [illegible]         .62        2.72       105.4

*t values of 3.25 and 2.26 are significant at the one and five percent levels of confidence
respectively (df = 9).














3. Comparison of estimates based on Wechsler-Bellevue with estimates based on case 
history. 

A comparison was made between each psychologist’s judgments of premorbid 
intellectual functioning based on the Wechsler-Bellevue protocols and his judgments 
based on the case histories, utilizing rank-difference correlations. In addition, each 
judge’s mean rating based on the Wechsler-Bellevue records was compared to his 
mean rating based on the case histories, and a t-test of significance between means 
was calculated. 

The rank-difference correlations between each judge’s ratings based on the 
Wechsler-Bellevue records and the ratings based on the case histories are presented 
in Table 5. Only one of these correlations is significant at the one per cent level of 
confidence, and this correlation is in the negative direction. Five of the remaining 
six correlations are also negative. 
TABLE 5. RANK-DIFFERENCE CORRELATIONS BETWEEN
EACH JUDGE’S ESTIMATES OF PREMORBID
INTELLIGENCE BASED ON THE WECHSLER-BELLEVUE
AND HIS ESTIMATES BASED ON THE CASE HISTORY








Judge Correlations 





-.15 











When the mean of each judge’s ratings based on the Wechsler-Bellevue records 
is compared with the mean of his ratings based on the case histories, five of the seven 
mean differences are significant at least at the six per cent level of confidence (Table 
6). When the mean of all the judges’ ratings on the Wechsler-Bellevue records is
compared to the mean of all the judges’ ratings on the case histories (Table 6), the
estimates from the case histories are 14.5 points higher on the average. This
difference is significant at the .001 level.


TABLE 6. SIGNIFICANCE OF DIFFERENCES BETWEEN MEAN ESTIMATES OF PREMORBID INTELLIGENCE
BASED ON THE WECHSLER-BELLEVUE AND MEAN ESTIMATES BASED ON THE CASE HISTORY

               Mean Estimate       Mean Estimate
Judge        Wechsler-Bellevue     Case History     Difference          t

  1                94.7               101.3             6.6        [illegible]
  2                91.5               108.8            17.3           2.48
  3                84.7               106.5            21.8           2.31
  4                97.0               104.9             7.9        [illegible]
  5                89.8               104.3            14.5           2.16
  6                83.7                97.5            13.8        [illegible]
  7                85.6               105.4            19.8           2.89

All Judges         89.6               104.1            14.5           5.61





Inspection of Tables 1 and 3 suggested that the intercorrelations between judges 
for estimates based on the case histories were greater than those based on the 
Wechsler-Bellevue. This was confirmed by statistical analysis in which the correla- 
tions were converted to z scores and differences between the mean z’s tested. The 
mean correlation for case history judgments was .76 as compared to .66 for Wechsler- 
Bellevue judgments. This difference is significant at the .05 level. 
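
The averaging step can be illustrated as follows. The sketch assumes that the "z scores" referred to are Fisher's r-to-z transformation, which the text does not name; the correlations are the 21 Wechsler-Bellevue values as read from Table 1. The test applied to the difference between the two mean z's is not specified in the text and is not reproduced here.

```python
import math

# The 21 intercorrelations of Table 1 (Wechsler-Bellevue), read row by row.
WB_CORRELATIONS = [
    0.84,
    0.68, 0.87,
    0.41, 0.15, 0.42,
    0.73, 0.64, 0.53, 0.30,
    0.64, 0.82, 0.74, 0.21, 0.70,
    0.67, 0.84, 0.87, 0.38, 0.64, 0.80,
]


def mean_correlation(correlations):
    """Average correlations on Fisher's z scale and convert the mean back to r."""
    zs = [math.atanh(r) for r in correlations]  # z = 0.5 * ln((1 + r) / (1 - r))
    return math.tanh(sum(zs) / len(zs))


wb_mean = mean_correlation(WB_CORRELATIONS)
print(round(wb_mean, 2))  # 0.66
```

Under this assumption the mean comes out at .66, in agreement with the figure reported for the Wechsler-Bellevue judgments. Averaging on the z scale rather than averaging the raw correlations matters here because the raw arithmetic mean of the same 21 values is appreciably lower.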


4. Comparison of estimates based on original psychological examination with estimates 
based on the Wechsler-Bellevue and case history alone. 
The psychologists who originally tested the patients had estimated the pre- 
morbid intellectual functioning of these ten subjects on the basis of the Wechsler- 
Bellevue, case history, and behavior in the test situation. It was decided to compare 
these estimates with the mean estimates made by the judges for each case on the 
basis of the Wechsler-Bellevue records alone and the case histories alone. However, 
since the original estimates were made in terms of a category of intelligence rather 
than an IQ score, quantitative analysis was not feasible. Qualitative analysis re- 
vealed the following: 

(a) In five of ten of the cases, mean estimates based on the Wechsler-Bellevue records alone 
show agreement with the estimates made by the original examining psychologists. For the re- 
maining five cases, the mean estimates based on the Wechsler-Bellevue records alone are below 
the estimates made by the original examining psychologists. 

(b) In six of the ten cases, mean estimates based on the case histories alone show agreement 
with estimates made by the original examining psychologists. Of the remaining four cases, three 
are above and one below the original estimates. 

While the estimates of Judge 4 were shown in Tables 1 and 3 to be consistently 
uncorrelated with the estimates of the other judges, Table 6 indicates that the 
estimates of Judge 4 on the Wechsler-Bellevue were closer to those made by the 
original examining psychologists. Thus, it would seem that while the six other judges
correlated highly among themselves, their absolute IQ estimates were consistently
lower than those of the original examiners. One judge (number 6) gave consistently
low estimates on both the Wechsler-Bellevue and the case history (Table 6).

A qualitative analysis indicated that the variations between judges were not
related to the level of training of the judges. There was no greater consistency of
performance among the staff psychologists as compared to the internes.


5. Analysis of reasons for estimating premorbid intelligence. 

The specific reasons for estimating premorbid intellectual functioning based on 
the Wechsler-Bellevue alone were categorized by the authors as follows: (a) inter- 
test scatter, (b) intratest scatter, (c) verbal versus non-verbal test performance, and 
(d) quality of verbalizations. All judges agreed that the most important evidence for
estimating higher premorbid intellectual functioning was the presence of intertest
scatter. This was based on the ranking as well as the frequency of use of this reason. 
The other reasons were of varying importance to the judges. 

A tabulation of the reasons given by judges as the bases for judgments of pre- 
morbid intelligence from the case histories alone also was done. The reasons given 
were categorized as follows: (1) developmental history, (2) psychiatric opinion espec- 
ially as expressed in the mental status examination, (3) cultural as well as socio- 
economic background, (4) work history, (5) educational history, and (6) social 
adjustment. 

The tabulation of reasons revealed that educational history was the most 
important single criterion for estimating premorbid intellectual functioning from 
case history material. All seven judges used this category and ranked it first. The 
next most frequently given reason was the information available in the psychiatrist’s 
mental status. All judges made use of this information, although not all to the same 
degree. Work history and cultural, socio-economic background were ranked next in 
importance, with the remaining reasons infrequently used. 


DISCUSSION 

The results of this study raise serious questions concerning psychologists’ 
evaluations of premorbid intelligence in severely impaired psychiatric patients. 
Six of the seven judges agree fairly well on their premorbid estimates based on the 
Wechsler-Bellevue protocols and case histories insofar as their relative ranking of 
cases is concerned (Tables 1 and 3). The absolute IQ estimates of premorbid intel- 
ligence based on the Wechsler-Bellevue show considerable differences between judges 
(Table 2). While the absolute IQ estimates based on the case histories show some- 
what better agreement, even here considerable disagreement is discernible (Table 4). 
From a clinical point of view, these absolute IQ estimates are of greater significance 
since the clinician will usually postulate the absolute premorbid functioning of the 
patient rather than a relative ranking. 

When each judge’s estimates based on the Wechsler-Bellevue are compared in 
terms of relative ranking with his estimates based on the case history (Table 5), the 
startling fact emerges that there is but one significant relationship and this is in the 
negative direction. Five of the seven judges show significant differences between 
their absolute estimates based on the Wechsler-Bellevue and those based on the case 
history (Table 6). 

When the judgments based on the Wechsler-Bellevue alone and the judgments
based on the case history alone were compared to the judgments made by the original
examining psychologists, who had not only these two types of data but also
observations of the patient in the testing situation, it was found that the independent
Wechsler-Bellevue ratings tended to underestimate the premorbid intelligence as
evaluated by the examining psychologists. Conversely, there was
some tendency for psychologists’ estimates based on the case history alone to be
higher than those given by the original examining psychologists. Thus in the cases
of severely impaired psychiatric patients, Wechsler-Bellevue data tend to set a
lower estimate of premorbid intelligence, whereas case history data tend to set a
higher one.

The study indicates that psychologists utilize more than a single criterion when 
using either the Wechsler-Bellevue or the case history for judging premorbid intelli- 
gence. But, not every criterion is used by all the psychologists, nor does every psy- 
chologist use identical criteria. It may be that herein lies a clue to the differences 
among judges found in this study—there is as yet insufficient knowledge regarding 
the relative importance of the various criteria or the significance of different criteria 
in different patients. Since judgments of premorbid intelligence must be based on 
criteria of premorbid functioning, without agreement on criteria there is likely to 
be little agreement on the judgments themselves. The almost complete absence of 
any relationships between estimates based on the Wechsler-Bellevue and those based 
on the case history clearly highlights the fact that the criteria used in the estimates 
of one bear little relationship to the criteria used in the estimates of the other. 

With the absence of uniform criteria, which consequently tends to reduce agree- 
ment, one implication for clinical practice seems obvious. Estimates of premorbid 
intellectual functioning on severely impaired patients should be based on all avail- 
able data about the patient. Until such time as either intelligence test data or case 
history or combinations of both can be validated against satisfactory criteria of pre- 
morbid functioning, the practice of utilizing data from only one source is not justified. 

It should be stressed that the results reported here cannot be generalized to 
judgments based on cases of minimal impairment. In cases of minimal impairment, 
these results may or may not apply. Ideally, a study such as this should be under- 
taken on patients with differing degrees of impairment and with judges from differ- 
ing clinical settings in order to make possible generalizations about psychologists’ 
judgments of premorbid intelligence. Even more critical is research on the validity 
of psychologists’ judgments. Perhaps one way of coping with the problem of a satis- 
factory criterion of premorbid intelligence would be to collect the records of a large 
sample of mentally ill patients with good premorbid test results, which test results 
could serve as the criterion of premorbid intelligence. 


SUMMARY 

In order to study psychologists’ estimates of premorbid intelligence based on 
the Wechsler-Bellevue and the case history, seven psychologists were asked to rate 
the Wechsler-Bellevue protocols and the case histories of ten severely impaired pa- 
tients for their premorbid intelligence. The results indicate that, while there is some 
agreement among judges in the estimates made, there is sufficient disagreement 
to indicate that this is an area that needs further research if the psychologist is to 
more adequately fulfill his responsibility in evaluating premorbid intelligence. 
A TRANSPOSED FACTOR ANALYSIS OF SCHIZOPHRENIC 
PERFORMANCE ON THE BENDER-GESTALT 


WILSON H. GUERTIN 


Veterans Administration Hospital 
Knoxville, Iowa 


PROBLEM 


Factor analyses of the Bender-Gestalt have provided some useful, though gross, 
information about the test. However, the obtained factors appear to be rather 
unstable because the Bender productions fail to show enough interfigure consistency. 
Such inconsistency often provides useful diagnostic patterns in sequential analysis 
but is not handled effectively by conventional factor analysis. For example, one 
individual may exhibit inconsistency by failing to bring together the two elements 
of figure A but overlapping those of figure 4. Although this inconsistency might 
diminish the importance of factors in a conventional analysis, if it is common to a 
definitive group of individuals, it may contribute to factors in a transposed analysis. 
A transposed analysis (factor analysis of correlations between individuals) takes 
advantage of sequential patterning on Bender items. 

Test items must reflect the subject’s important individual differences if a trans- 
posed factor analysis is to be productive. Therefore, a satisfactory transposed analy- 
sis serves as indirect evidence of test validity. If the Bender-Gestalt is an effective 
instrument in the total evaluation of the personality of schizophrenics, it should be 
effective in differentiating types of schizophrenics from one another. It was hoped, 
too, that this study would lead to further evidence on the nature of phenomenal clus- 
tering of schizophrenic individuals and the Bender features underlying the clusters. 


PROCEDURE 


Thirty-two resident male patients with varied schizophrenic subtype diagnoses 
were employed. No particular attempt was made to control any of the incidental 
case-history variables, and each diagnostic subtype was sampled from the whole 
hospital population except for the seriously disturbed or very deteriorated. The 
most relevant case-history variables probably would be the duration of illness and 
age. The mean age of 50.5 years and the mean number of years since first admission 
to the hospital of 13.5 years indicate the chronic nature of this sample. 

The test was administered in standard manner. Each protocol was scored
with respect to 100 different items. Distortions on the various figures were evaluated
by a method which follows conventional scoring procedures, constitutes a rather
exhaustive analysis of each protocol and has been elsewhere reported.

Tetrachoric intercorrelations between the individuals were obtained from fourfold
contingency tables and charts. The resulting intercorrelation matrix was factored
by the multiple-group centroid method. Communalities for those individuals
within a cluster were first estimated by employing the individual’s largest correlation 
within the cluster. Those outside the clusters had communality estimates based 
upon the highest intercorrelation in the whole column. Reestimates of the communal- 
ities were made after the first factoring, and plots of the factor space were studied 
in order to provide more stable clusters. 
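
The first step of a transposed analysis, correlating individuals rather than items, can be sketched as below. This is an illustration under stated assumptions and not the procedure used in the study: each individual is represented by a hypothetical list of 0/1 item scores, and the tetrachoric coefficient is approximated by the well-known cosine-pi formula in place of the computing charts mentioned above. The function names are illustrative.

```python
import math


def fourfold(person_x, person_y):
    """Fourfold table for two individuals scored 0/1 on the same items.

    Returns (a, b, c, d): both present, x only, y only, both absent.
    """
    pairs = list(zip(person_x, person_y))
    a = sum(1 for x, y in pairs if x == 1 and y == 1)
    b = sum(1 for x, y in pairs if x == 1 and y == 0)
    c = sum(1 for x, y in pairs if x == 0 and y == 1)
    d = sum(1 for x, y in pairs if x == 0 and y == 0)
    return a, b, c, d


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation: r = cos(pi / (1 + sqrt(ad / bc)))."""
    if b * c == 0:
        # No disagreements in one direction: the approximation tends to +1,
        # unless the agreement cells are empty too, when it is undefined.
        return 1.0 if a * d > 0 else float("nan")
    return math.cos(math.pi / (1 + math.sqrt((a * d) / (b * c))))


# Hypothetical scores of two patients on eight of the scored Bender items.
patient_1 = [1, 1, 0, 0, 1, 0, 1, 0]
patient_2 = [1, 1, 0, 0, 0, 1, 1, 0]

table = fourfold(patient_1, patient_2)
print(table, round(tetrachoric_cos_pi(*table), 2))
```

Repeating this for every pair of individuals gives the person-by-person intercorrelation matrix that is then factored. With only a handful of items the approximation is crude; the study scored 100 items per protocol.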

Malamud-Sands Rating Scales were obtained at the time the Bender was
administered. These psychiatric rating scales provided the behavioral and symptom
information about each of the patients necessary for recognizing those personality 
characteristics underlying each of the types of Bender performance. 


RESULTS 


Contrary to expectancy, the intercorrelation matrix failed to reveal a large 
group factor which might correspond to “schizophrenia.” A conventional cluster
analysis made it rather easy to recognize four relatively independent clusters.
Simultaneous multiple-group centroid extraction of these four factors produced the 
oblique factor matrix reported in Table 1. 





TABLE 1. OBLIQUE FACTOR MATRIX OF TYPES OF SCHIZOPHRENIC BENDER PERFORMANCE*

                                   Factor

        A                 B                 C                 D
     Chronic         Disorganized     Conforming and       Actively
 Undifferentiated                     Non-defensive        Defensive

76 00
63 31

*Decimal points omitted.


The original communality estimates accounted for somewhat more than half 
of the total variance of the individuals, and the final factoring accounted for 83 per- 
cent of this originally estimated communality. The intercorrelations of the oblique 
factors were as follows: .3 for A with B, .3 for A with C, .2 for A with D, .2 for B with 
C, .1 for B with D and .4 for C with D. 


DISCUSSION 


The fairly high communality among the subjects of the study supports the 
arguments, first, that the Bender-Gestalt validly represents the individual’s psy- 
chiatric features, and second, that it is a helpful diagnostic instrument for grouping 
schizophrenics. Kraepelin’s idea of a general disorder of “schizophrenia” receives 
relatively little support from Table 1, inasmuch as the factors show quite variable 
loadings for the different individuals and no particularly large group factor
corresponding to “schizophrenia.” While the objective scoring system, coupled with
the interpretive scheme, does not adequately describe any individual schizophrenic, 
the major portion of his total performance can be evaluated in this gross fashion. 

Three of the factors from this study correspond to earlier findings. A previous
transposed analysis of schizophrenics based upon psychiatric symptoms revealed
three types which correspond well with A, B, and D of the present study. They
were, respectively, confused-withdrawn, hebephrenic silliness, and paranoid. Jenkins
and others have presented a theoretical conceptual approach to schizophrenia
based upon analysis of various data. Their three types correspond closely to those
mentioned above and are called withdrawn, disorganized, and with psychotic
reorganization. The fourth factor in the present study, the conforming and non-defensive
schizophrenic, may often be poorly represented in samplings, thus failing to appear
in the previous reports. Such subjects are often regarded as in sufficient remission
so as not to be included in “representative samples of schizophrenia,” and would not
ordinarily appear in samples of the admission wards.


The Chronic Undifferentiated Schizophrenic (A) 


This type of patient attempts to conform (long response time, erasure present), 
but some minor inaccuracies enter (center dots of 3 not level, poor angulation in 7, 
poor juxtaposition of elements in 4). He seems to be the well-preserved chronic
undifferentiated schizophrenic who does not engage in very bizarre behavior or thought
but is definitely seclusive. His everyday adjustment is impaired but not to the extent 
that he is cut off from reality. He is capable of making a good institutional adjust- 
ment. 


The Disorganized Schizophrenic (B) 


This type of individual is characterized by disorganized and confused perform- 
ance (collision, nonclosure of elements) as well as generally defective performance 
(loss of angles in 3, wrong horizontal crossing point in 6, substitutions, number
distortion). Productions are bizarre, and the gestalten often hardly recognizable.
Performance is clearly inferior to that of the other types of individuals.
Psychiatrically, he appears to be the hebephrenic with considerable scattering of thought. He
is unstable and makes a poor hospital adjustment. 


The Conforming and Non-defensive Schizophrenic (C) 


This type of individual shows some rather gross disorganization features (non- 
closure of elements, near-collision, rotation of A). However, distortions are less im- 
portant and less frequent than for the B type. He attempts to conform (restroking 
present), and his general integration, though weak, is retained. Superficially this 
appears to be a milder form of the B type, but further study shows that, unlike the 
other types, he has established ties with the environment, though at a superficial 
level, and seems to lack defenses against relationships to things and people. Despite 
some disorganized features, he appears to be a stable chronic schizophrenic who has 
developed schizophrenic solutions for most of his inner conflicts. Though dilapidated, 
he makes a good hospital adjustment. 


The Actively Defensive Schizophrenic (D) 


This type of individual is sufficiently self-critical and in good contact (restroking 
present) so that few distortions are present. Defenses seem to be strongly active, 
particularly restitutional symptoms of various kinds. While some of this type are 
grandiose, others are extremely withdrawn, and the chief psychiatric similarity 
among them is their intensity of internal struggle for adjustment. One gets the feel- 
ing that a major battle for balance is being fought within these individuals. This 
type can be expected to be episodically disturbed but not particularly deteriorated. 
Paranoid features are frequently encountered. 

In conclusion, it is felt that this transposed analysis of Bender-Gestalt perform- 
ance contributes to the available information on subtype groupings of schizophrenia. 
The present study included very few acutely ill subjects, and the derived factor- 
types might differ if an acutely ill sample were employed. 
SUMMARY AND CONCLUSIONS 


1. In an attempt better to understand the classification of the schizophrenias 
and diagnostic features of the Bender-Gestalt Test, the performance of thirty-two 
male schizophrenics on the Bender-Gestalt was subjected to a transposed factor 
analysis. Ratings of psychiatric characteristics provided information about individ- 
uals with particular types of Bender performance. 


2. The factor analysis disclosed a large commonness among the Bender per- 
formances of the individuals in this study. Four types of schizophrenics were sug- 
gested. They are as follows: (A) Chronic Undifferentiated Schizophrenic, (B) Dis- 
organized Schizophrenic, (C) Conforming and Non-defensive Schizophrenic, and (D) 
Actively Defensive Schizophrenic. 


3. No general factor of “schizophrenia” appeared. The results of the study are 
discussed in relation to previously reported schizophrenic types. 
THE DURATION OF THE THERAPEUTIC RELATIONSHIP AND
THERAPISTS’ SUCCESSIVE JUDGMENTS OF PATIENTS’ MENTAL
HEALTH¹


DOROTHY CLIFTON CONRAD 
Western Carolina College 


PROBLEM 


The optimal duration of psychotherapy is essentially indeterminate, with indica- 
tions for both long continuing and shorter term procedures. Men eligible for Vet- 
erans Administration treatment at the time of this study (1950-51) must be thought 
of as a group with chronic disabilities. Acute exacerbations do occur, and it is com- 
monly at such times that clinic treatment is sought. Any treatment which offers 
hope of some benefit for such a group requires at least a number of months of regular 
contact. The first interview is a crucial point. Very little assurance that a given 
patient will return for subsequent interviews can be offered the therapist. We need 
to increase our understanding of characteristics which differentiate patients who can 
be expected to continue in therapy for varying lengths of time. 

As soon as a patient begins his first interview with the therapist, that therapist 
becomes a part of the system of psychological pressures within which the patient is 
functioning. Upon the interaction of therapist and patient depends in large measure 
the duration and outcome of the patient’s stay in therapy. Explicit and implicit 
judgments are made by the therapist at this time and these are responded to by the 
patient. Therapists do not transfer patients to other therapists, and patients in this 
situation rarely request a transfer. Within this frame of reference, the question may 
be asked: “What judgments by the therapist, at the initial interview, are associated
with the length of the patient’s stay in therapy?” There is, then, the further question,
“How do these judgments vary during the course of therapy?”


METHOD 


Patients included in this study were one hundred male veterans who entered 
psychotherapy at the Veterans Administration Mental Hygiene Clinic in San 
Francisco during or after July 1950. All therapists in the Clinic were asked to supply 
data on each new patient immediately after the first interview through use of a 
mental health check list. The first hundred admissions which met this requirement 
constituted the sample. Twenty-seven therapists participated, including five psy- 
chiatrists, six psychiatric residents, four psychiatric social workers, four student 
social workers, two clinical psychologists and six clinical psychology trainees. 

Data were obtained through the use of The Pattern of Living, a mental health 
checklist for men developed by the author. This checklist is based on a concept of 
mental health, presented in an earlier publication, that the broad functioning of 
the individual in his daily pattern of living can be described in terms of three princi- 
pal areas: positive mental health; social conformity; and behavior pathology. The 
checklist consists of 45 items; 16 for positive mental health; 12 for social conformity; 
and 17 for behavior pathology as presented in Fig. 1. 

The sample consisted of four groups of patients including 25 who came for only 
the initial therapeutic interview (Group A); 22 who had a total of 2-6 interviews, 
with a mean of 4.3 (Group B); 22 who had a mean of 12.1 interviews (Group C); and 
31 who were still in treatment after 8-12 months and had a mean of 14.4 interviews 
at their third rating (Group D). A second rating was called for after the first month; 
a third after four months; and a fourth when the patient had been in treatment ap- 
proximately eight to twelve months. 


1 From the Veterans Administration Regional Office, San Francisco, Calif. The author wishes to
acknowledge the generous and indispensable advice of Professor Robert C. Tryon, University of
California, a consultant for the Veterans Administration. The author is, however, solely
responsible for all statements made.
FIG. 1. Items Included in “The Pattern of Living,” a Mental Health Check List.

POSITIVE MENTAL HEALTH
1. Has a positive affective relationship with someone.
2. Finds enjoyment in something.
3. Has some capacity to see himself as others see him.
4. Has some sense of humor.
5. Is learning something and expanding his field of experience.
6. Does something to promote the welfare of another.
7. Works with another for mutual benefit.
8. Creates something. Pursues some activity for its own sake.
9. Takes calculated risks for possible gains.
10. Responds to aesthetic stimuli.
11. Contemplates meanings and values.
12. Acts on personally accepted system of values.
13. Self-evaluation compatible with reality.
14. Recognizes difficulties and attempts to solve them.
15. Has insight into difficulties.
16. Concerns self with community and social issues.

SOCIAL CONFORMITY
17. Is self-supporting.
18. Supports wife and children.
19. Is married or planning to be.
20. Has a vocation, as distinct from “just a job.”
21. Engages in social and/or sexual activities with women.
22. Initiates social contacts.
23. Associates with some voluntary groups: e.g., social activities, clubs.
24. Accepts financial and legal responsibilities.
25. Engages in some recreation.
26. Accepts necessary directions from those in authority.
27. Pays moderate attention to cleanliness, diet, etc.
28. Open conflict with law limited to minor and infrequent infractions.

BEHAVIOR PATHOLOGY
29. Characterized by overt hostility.
30. Characterized by extreme passivity.
31. Devotes much time to strongly disliked activities.
32. Devotes much time to activities to which he is unsuited.
33. Prefers his pathology to treatment.
Complains of:
34. Affective disturbance.
35. Addictions.
36. Anxiety.
37. Compulsions or phobias.
38. Dissociative experiences.
39. Sexual inadequacies.
40. Social inadequacies.
41. Somatic disturbances.
42. Speech disorder.
43. Thinking disturbance.
44. Vocational inadequacies.
45. Other.


Product-moment correlations of total and part scores for Groups C and D 
combined were computed between the first and second ratings and between the 
second and third ratings. Six correlations were thus available for each pair of ratings. 
Values for the first pair were .51 for behavior pathology, .84 for social conformity, 
.71 for positive mental health, and .78 for the total. When the second ratings were 
compared with the third, the values ranged from .80 for social conformity to .89 
for the positive mental health score combined with the social conformity score. 
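The product-moment correlations reported above can be illustrated with a minimal sketch. The function name `pearson_r` and the five pairs of scores are hypothetical and are not taken from the study; the sketch only shows how a correlation between two ratings of the same patients would be computed.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sxy / sqrt(sxx * syy)

# Hypothetical total composite scores for five patients (not the study's data).
first_rating = [52, 60, 48, 66, 57]
second_rating = [55, 58, 50, 70, 54]
r = pearson_r(first_rating, second_rating)
print(f"r between first and second rating: {r:.2f}")
```

In the study itself the same computation would be applied to the total score and to each part score for every pair of consecutive ratings.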


RESULTS 


Table 1 summarizes the findings in terms of mean scores for each group at each 
rating. Two sets of scores are presented: mean composite scores, obtained by allow- 
ing two points for a plus score, one point for a question mark and no points for a 
minus; and, second, the mean number of plus scores. 
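The two scoring rules just described can be sketched as follows. The point values (two, one, none) come from the text; the function names and the marks for the two patients are hypothetical illustrations on the 12 social conformity items.

```python
from statistics import mean

# Point values stated in the text: plus = 2, question mark = 1, minus = 0.
POINTS = {"+": 2, "?": 1, "-": 0}

def composite_score(marks):
    """Composite score for one patient: sum of points over the rated items."""
    return sum(POINTS[m] for m in marks)

def plus_count(marks):
    """Number of items given a plus score for one patient."""
    return sum(1 for m in marks if m == "+")

# Hypothetical marks for two patients on the 12 social conformity items.
patient_1 = ["+", "+", "?", "-", "+", "?", "-", "+", "+", "-", "?", "+"]
patient_2 = ["+", "-", "-", "?", "+", "+", "-", "?", "-", "+", "-", "-"]
group = [patient_1, patient_2]

# Group means, corresponding to the MCS and MPS columns of Table 1.
mean_composite = mean([composite_score(m) for m in group])
mean_plus = mean([plus_count(m) for m in group])
print(mean_composite, mean_plus)
```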

Inspection of Table 1 indicates that variability of mean scores for each of the 
categories on the checklist was small at the initial rating for these groups of patients. 
The reflected composite scores for psychopathology (obtained by subtracting the 
obtained score from the total possible score in order to give a positive valuation com- 
parable with the other categories) were virtually identical for all four groups. The 
checklist was constructed with the aim of discriminating among the male population 
in general, and not as a method of differentiating among patient groups. The pre- 
sence of pathology is, furthermore, the selective factor which brings these men into 
the clinic. Hence, it was presumed that all male out-patients would have somewhat 
similar scores. 
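The reflection of the psychopathology score can be sketched in a few lines. This is an illustration under two assumptions: that the total possible composite score is the number of pathology items (17) times the two points allowed for a plus, and that the obtained score of 12 is a hypothetical value chosen only for the example.

```python
PATHOLOGY_ITEMS = 17      # behavior pathology items on the checklist
MAX_POINTS_PER_ITEM = 2   # assumed from the composite rule: a plus earns two points

def reflected_composite(obtained, n_items=PATHOLOGY_ITEMS,
                        max_per_item=MAX_POINTS_PER_ITEM):
    """Subtract the obtained score from the total possible score."""
    total_possible = n_items * max_per_item
    if not 0 <= obtained <= total_possible:
        raise ValueError("obtained score lies outside the possible range")
    return total_possible - obtained

# Hypothetical obtained pathology composite of 12 out of a possible 34.
print(reflected_composite(12))
```

After reflection, a higher score means less recorded pathology, which makes the value comparable in direction with the other two categories.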


Comparisons of Mean Scores. Group A, in its single rating, was distinguished by 
having a higher mean composite score for social conformity than any other group, 
except Group C at the third rating. Group A, whose members left after a single 
therapeutic interview, also had the lowest mean score for positive mental health 
at this single rating. 
TABLE 1. Mean Composite Scores and Mean Number of Plus Scores for Each Group at
Each Meeting

[Most cell values are not recoverable. Columns: Group; N; Total (MCS*, MPS**); Positive
Mental Health (MCS, MPS); Social Conformity (MCS, MPS); Pathology, Reflected (MCS, MPS).
Rows: First Rating: A (N = 25), B (N = 22), C (N = 22), D (N = 31). Second Rating:
B (N = 22), C (N = 15), D (N = 30). Third Rating: C (N = 19), D (N = 31). Fourth Rating:
D (N = 28). Legible first-rating entries under Pathology (Reflected): A, 22.0; B, 22.4;
C, 22.0; D, 22.4.]

*Mean Composite Score
**Mean Plus Score



Group B differed from the others by showing the lowest composite score for 
social conformity. Group C was not distinguished in any way at the first rating. 
Group D, which was to remain in treatment longer than any others, had the highest 
initial positive mental health score. 

Direct comparisons of the groups are important for arriving at the meanings 
of the scores. When Group A is compared with Group D at the first rating, there is a 
definite tendency for Group D to exceed in plus and composite scores for positive 
mental health, as well as in plus scores for psychopathology, but not for social con- 
formity. The comparison of Group A with Group B at the initial rating is especially 
marked by the more frequent occurrence of social conformity items in Group A and 
by a slight excess of positive mental health scores for B. Group C was not clearly 
distinguished from any other group. Group D tended strongly to exceed B in 
all categories at the first rating. 

Self-comparisons of the groups had the following results. The two ratings for 
Group B paralleled each other very closely, with some excess of plus scores for psycho- 
pathology. This was also true of the first two ratings for Groups C and D. It should 
be noted that, just before its departure from treatment, on the third rating, Group 
C achieved the highest mean total composite score recorded, as well as the highest 
score for social conformity. The third rating for D showed a general increase in all 
categories. 

These data suggest that scores on the first rating have the following meanings: 
(a) Psychopathology noted at the first interview does not offer a basis for prediction 
of length of stay in psychotherapy. (b) High scores on social conformity in combina- 
tion with low scores on positive mental health will be associated with immediate re- 
jection of psychotherapy. (c) Low scores on social conformity will be associated 
with a tentative testing out of therapy and a failure of continued attendance. (d) 
High scores on positive mental health will be associated with long time continuation 
in treatment. 

The combined sequences of ratings suggest the following meanings. (a) Persistence
in therapy will be associated with increasing scores in positive mental health.
(b) When positive mental health and social conformity increase together, the patient
is likely to discontinue therapy. (c) When positive mental health is high, social
conformity low and pathology increasing, the patient tends to stay in therapy. It
should be remembered that all these statements refer to judgments by the therapists
as recorded on The Pattern of Living checklist and may or may not be accurate
descriptions of the patients themselves.


Differences between Mean Positive Mental Health and Psychopathology Scores. A 
rather obvious anticipation is that the ability to make use of the opportunity for 
therapy would be reflected by the balance between scores on positive mental health 
and those on psychopathology: that a heavier weighting for positive mental health 
would be associated with continuation in therapy. In this particular set of data, all 
mean differences between the two sets of scores were in favor of positive mental 
health. Such a general statistical finding is, of course, largely a function of the con- 
struction of the checklist, which made it probable that scores for psychopathology 
would be lower. 

Comparative findings are more revealing. At the first rating, Group A had the 
smallest difference and there was a gradual increase in the groups, corresponding 
with the length of time the groups were to stay in therapy. By the second rating, 
at the end of the first month, difference scores for Groups B and C were quite close, 
but had dropped to even less than that for A at the first rating. Group D was close 
to its own first rating and definitely exceeded the other two groups. By the third 
rating, after four months of treatment, both C and D had increased decidedly, with 
C more than doubling its previous rating and reaching the highest difference score 
of all. At this point Group C left treatment. Group D had the highest mean plus 
score at the beginning and deviated least from it throughout. One implication to be 
noted is that for all groups, the end of the first month seems to represent a low point
in the therapists’ judgments. It appears, furthermore, that Group C, which stayed
about four months, may be the group which profited most from treatment and which
was least understood. This seeming paradox needs clarification. The group which
had the greatest positive difference between positive mental health and psychopathology
plus scores throughout, Group D, was the one which remained in therapy
longest.


Item Study. An item study was made by sorting out those items which were given 
plus scores for two-thirds or more of each group and those items which were given 
plus scores for less than one-third of each group. Certain items were found to be 
characteristic of all groups at all ratings. These were: “Pays moderate attention to
cleanliness, diet, etc.” and “Complains of anxiety.” Low frequencies were consistently
observed for: “Concerns self with community and social issues,” “Devotes
much time to strongly disliked activities,” and “Complains of dissociative experiences.”
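The sorting rule used in the item study can be sketched as below. The two-thirds and one-third cutoffs come from the text; the function name `classify_items`, the group size of 25, and the plus counts are hypothetical and serve only to show the rule at work.

```python
def classify_items(plus_counts, group_size):
    """Split items into high-frequency (plus for two-thirds or more of the group)
    and low-frequency (plus for less than one-third of the group)."""
    high, low = [], []
    for item, count in plus_counts.items():
        # Integer comparisons avoid rounding trouble with 2/3 and 1/3.
        if 3 * count >= 2 * group_size:
            high.append(item)
        elif 3 * count < group_size:
            low.append(item)
    return high, low

# Hypothetical plus counts for a group of 25 patients (not the study's data).
plus_counts = {
    "Complains of anxiety": 20,
    "Has some sense of humor": 12,
    "Concerns self with community and social issues": 5,
}
high, low = classify_items(plus_counts, 25)
print("high:", high)
print("low:", low)
```

Items falling between the two cutoffs are left unclassified, and an item would count as characteristic of all groups only if it appeared in the high list for every group at every rating.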

At its single rating, Group A also had high frequencies for “Has a positive
affective relationship,” “Is self-supporting,” “Accepts financial and legal
responsibilities,” and “Complains of somatic disturbances.”

On their two ratings, therapists consistently gave Group B a high frequency of
plus ratings for “Open conflict with the law limited to minor and infrequent
infractions.” At the first rating they were seen as high in “Recognizes difficulties and
attempts to solve them.” When the therapists rated them the second time this item
had decreased in frequency and was replaced by “Finds enjoyment in something”
and “Acts on a personally accepted system of values.”

Group C at the first rating appeared very much as did Group B at the second.
But, in addition, there was emphasis on “Is self-supporting” and on “Complains of
affective disturbance.” By the second rating for C, the therapists gave high plus
scores for many more items than have been noted up to this time. In the positive
mental health area, these included: “Has a positive affective relationship;” “Finds
enjoyment in something;” “Has some sense of humor;” “Does something to promote
the welfare of another;” “Contemplates meanings and values;” and “Acts on
a personally accepted system of values.” Among the social conformity items, there
were high frequencies for “Is self-supporting;” “Engages in social and/or sexual
activities with women;” “Accepts financial and legal responsibilities;” “Accepts
necessary directions from those in authority;” and “Open conflict with the law
limited to minor and infrequent infractions.” There were no outstanding items
under behavior pathology. Group C’s third and last ratings again showed a relatively
larger number of items with high frequencies. Seven of the original eleven high
frequency items were retained and two, “Is learning something and expanding his
field of experience” and “Responds to aesthetic stimuli,” were added. Again there
were no outstanding items under behavior pathology.

Group D was still in treatment at the time the last data were gathered, having 
remained at least 8-12 months. The therapists had noted certain characteristics 
at the beginning which continued to distinguish these patients throughout. In the 
Positive Mental Health area these were “Has some sense of humor” and
“Contemplates meanings and values.”


SUMMARY 


The following hypotheses are offered for further testing. 

1. The patient who is judged by his therapist to have a relatively high degree 
of positive mental health at the beginning of treatment will tend to remain in treat- 
ment longer than others. 

2. The patient who is initially judged by his therapist to have a relatively low 
degree of social conformity will tend to try out therapy briefly before rejecting it. 

3. The patient who is initially judged by his therapist to have a relatively 
high degree of social conformity will tend to reject therapy at once. 

4. Behavior pathology ratings will not vary systematically with duration of 
therapy. 

5. The patient who stays longer in therapy will tend to have a greater positive 
discrepancy between positive mental health ratings and behavior pathology ratings. 

6. Positive mental health scores and social conformity scores will tend to in- 
crease with the duration of therapy. 

7. When a therapist judges a patient (a) to have a sense of humor, or (b) to
be contemplative about meanings and values, the patient is more likely to stay in 
treatment. 
A NOTE ON ATTEMPTED EVALUATIONS OF PSYCHOTHERAPY
RICHARD DE CHARMS, JEROME LEVY AND MICHAEL WERTHEIMER 
Wesleyan University 


An examination of the literature on the results of psychotherapy reveals very 
diverse results. It would seem that the bulk of the literature can be used for only 
one purpose: to find out what not to do when undertaking a study of this kind. 
However, recent research in this field is beginning to profit from the mistakes of the 
past and some of the current investigations show great promise.

Eysenck has made a valuable contribution by pointing up the
difficulties involved in drawing conclusions from existing investigations. He
reviewed the literature and compared the results reported by psychoanalytic and
eclectic therapists with reports of spontaneous remission from institutions which
presumably utilized little or no psychotherapeutic treatment. This comparison
led him to conclude that “the figures fail to support the hypothesis that
psychotherapy facilitates recovery from neurotic disorder.” Eysenck states that
no further conclusions are warranted because of the difficulties encountered in
comparing results of diverse studies that are not specifically designed for comparison
purposes.

Although we are in agreement with a conclusion which states that there are no 
figures to support the hypothesis that psychotherapy is beneficial in treating neurotic 
disorders—with the important addition that there are also no figures to support the 
hypothesis that psychotherapy is not beneficial—we must take issue with Eysenck’s 
specific conclusion. He states that the figures fail to support the positive hypothesis 
that therapy is of benefit, and the reader might feel that by implication they sup- 
port the negative hy pothesis or the null hypothesis. It is our contention that the 
figures used by Eysenck are too unreliable to be used in connection with any hypo- 
thesis, positive or negative, and that no conclusion of any kind should be drawn from 
them. This modification may seem trivial, since Eysenck himself deplores the unre- 
liability of the data he uses, but at the present time when the need is so great for 
reliable information, a misleading conclusion may be dangerous. 

The essence of Eysenck’s argument may be summed up as follows. (a) The best 
estimate of the spontaneous remission rate which can be obtained from existing data 
is that approximately two-thirds of the patients recover or improve without the 
benefit of psychotherapy. (b) The best estimate of the improvement resulting from 
psychotherapy is that about two-thirds of the patients treated improved. (c) There- 
fore, the figures fail to prove that psychotherapy is of value to the patient. These 
three points will now be discussed in order. 


(a) Admitting many of the shortcomings of the existing data, Eysenck states
that it is still possible to “conclude with some confidence that our estimate of some
two-thirds of severe neurotics showing recovery or considerable improvement without
the benefit of systematic psychotherapy is not likely to be very far out.”
This conclusion is reached after reviewing the material presented by Landis and
Denker, and is based on figures involving two year follow-ups. Landis and Denker
agree on a figure of 72% for spontaneous remission.

Eysenck emphasizes the fact that despite differences in the studies “Denker’s
figure agrees exactly with that given by Landis.” We feel that the coincidence
is the less reliable because of the inherent differences in the studies, which
are themselves more unreliable than one would infer from Eysenck’s discussion of
them. The probability is that the hospitals which supplied the statistics for Landis’
study had greatly varying criteria of cure, while Denker’s criteria were quite definite.
In view of this, and since Denker’s cases have not been shown to be as severe as
Landis’, the data become extremely difficult to compare and interpret. We must
consider the possible influence of other uncontrolled variables such as the following:
(a) maturation alone may have been responsible for cure in some cases; (b) somatic
etiology may have been involved in some cases (perhaps some of the individuals in
Denker’s case studies had “neuroses” which constituted only “benign nervous states”);
(c) hospital confinement and treatment may themselves be therapeutic; (d)
Denker’s patients received some of the elements of psychotherapy from their own
physicians, as intimated in his article; and (e) some of Landis’ group did receive
psychotherapy. All these factors combine to prevent the valid use of the statistics
included in these studies for an evaluation of psychotherapy.

Eysenck also states that these results are typical and that they are “remarkably
stable from one investigation to another.” This statement is
questionable in view of reports of five year follow-ups such as (a) that of Friess and
Nelson, where one may interpret the results, as did Wilder, to mean that 20%
is the spontaneous remission rate, and (b) that of Denker, where 90% is reported
as the spontaneous remission rate for a five year follow-up. If these two studies
differ so widely, it appears that existing figures for spontaneous remission rates are
not at all consistent. Although Eysenck used a two year base, we see no reason why
a five year base may not be taken in comparing two studies, especially since we found
no other studies utilizing a two year follow-up with which to check Eysenck’s claim
of stability.


(b) Nineteen studies reported in the literature are incorporated by Eysenck
into a composite table. The exact criteria used in their selection are not
clear, although it is stated that many articles were not included “because of such
factors as excessive inadequacy of follow-up, partial duplication of cases with others
included in our table, failure to indicate type of treatment used, and other reasons
which made the results useless from our point of view.” The table is broken
down into four categories running from “cured” to “not improved” as judged by
Eysenck from the results reported in the various studies. From the totals of the
table, the figure of slightly less than two-thirds is derived for improvement resulting
from psychotherapy. (Psychoanalysis yielded 44% improvement if patients who
stopped treatment are classed as not improved; eclectic therapy yielded 64%.)
Eysenck states that the figure reached by means of his table is “stable from one
investigation to another,” but it seems that the table itself casts doubt on
this statement. The percentage of improvement in the different studies summarized
actually ranges from 39% to 77%.
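How much such a percentage depends on the treatment of patients who stopped therapy can be seen in a small sketch. The counts below are hypothetical and are not Eysenck's; the function name `improvement_rate` is likewise only illustrative. The point is that the same set of cases yields different figures depending on one classification decision.

```python
def improvement_rate(improved, not_improved, dropped_out, count_dropouts_as_failures):
    """Percentage improved, with or without dropouts counted as not improved."""
    denominator = improved + not_improved
    if count_dropouts_as_failures:
        denominator += dropped_out
    if denominator == 0:
        raise ValueError("no cases to evaluate")
    return 100.0 * improved / denominator

# Hypothetical counts for a single study (not taken from any published table).
improved, not_improved, dropped_out = 40, 20, 20

with_dropouts = improvement_rate(improved, not_improved, dropped_out, True)
without_dropouts = improvement_rate(improved, not_improved, dropped_out, False)
print(f"dropouts classed as not improved: {with_dropouts:.1f}%")
print(f"dropouts excluded:                {without_dropouts:.1f}%")
```

With these made-up counts the rate moves from one half to two-thirds, a shift of the same order as the differences on which the comparison of therapies rests.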

The studies themselves really prove nothing regarding the effects of psycho- 
therapy, just as Landis and Denker prove little regarding spontaneous remission 
rates. Diagnostic criteria were not carefully established; techniques differed; the 
therapists differed in experience; the environmental setting in which therapy was 
given varied. Criteria of cure varied; the possibility of somatic etiology was not 
excluded; and there was no equating for spontaneous remission or for the effect of 
maturation; i.e. there was no adequate control group. 


(c) Since the proportion of spontaneous remission is set at approximately
two-thirds over a two year period, and the figure for therapy is also set at about
two-thirds, Eysenck concludes that these data “fail to prove that psychotherapy,
Freudian or otherwise, facilitates the recovery of neurotic patients.” If our
review of Eysenck’s arguments presented above invalidates the use of the figure
“two-thirds” for the beneficial results of psychotherapy or spontaneous remission,
then this further conclusion, dependent upon his first two premises, is not valid.

The faults inherent in the studies, and the dangers involved in comparing 
them, seem to us to preclude the possibility of drawing any valid scientific conclu- 
sions based on the figures culled from these investigations. We feel that the only 
conclusion which can be made at this point is that we have as yet no data on the 
basis of which to evaluate the therapeutic effects of psychotherapy. 
A PRELIMINARY STUDY OF FRUSTRATION REACTIONS
OF THE POST-POLIOMYELITIC 


LEONARD V. WENDLAND 


Rancho Los Amigos Respiratory Center for Poliomyelitis 
Hondo, California 


PROBLEM 


The purpose of this paper is to report: (a) the reactions of a group of
post-poliomyelitic subjects to frustrating situations as reflected in their responses on the
Rosenzweig Picture-Frustration Study, (b) a comparison with the Rosenzweig
P-F Study normative data, and (c) whether or not there is a significant
difference in the responses made on the Rosenzweig P-F Study by subjects with
varying degrees of residual involvement.


PROCEDURE 


Eighty-two non-hospitalized post-poliomyelitic subjects were used in this
study, 29 males and 53 females. This group of subjects has been used previously in
other papers. Each of the 82 subjects was given the Rosenzweig Picture-Frustration
Study. The responses were checked by the author and one other person
using the Revised Scoring Manual for the Rosenzweig Picture-Frustration Study.
Table 1 gives the age distributions for the subjects used in this study. Of the females,
approximately 34 per cent are single and 66 per cent are married. Approximately 25
per cent of the males are single and 75 per cent are married. In comparing the
marital status of these male and female subjects with that of the general Los
Angeles population, it is found that a slightly higher percentage are married.
The mean age for the female subjects is 33.6 years and that for the male subjects is
33.4 years, a total mean age of 33.5 years. The occupational status of these subjects
is indicated in Table 1. While no statement is made in this paper concerning the
earnings at present, the earning status of these subjects is favorable, as may be seen
in a paper published by the author which also reports other detailed descriptive
data.


TABLE 1. Age Distribution and Employment Status of Post-Poliomyelitic
Subjects Used in This Study

[Cell values are not recoverable. Columns: Females, Males, Totals. Legible age steps:
25-29 years, 30-34 years, 35-39 years, 40-44 years, 45-49 years. Occupational
classifications: Professional, Semi-Professional, Sales, Clerical, Service, Agricultural
Services, Skilled Labor, Semi-skilled Labor, Unskilled, Unclassified.]

Inasmuch as Rosenzweig has published data on a group of 460 normal subjects, 
his normative group was used as a basis for comparison in this study. Male and
female subjects were contrasted with Rosenzweig’s male and female subjects separ- 
ately, then the total post-poliomyelitic group was compared with Rosenzweig’s total 
group. The post-poliomyelitic subjects were then divided in terms of the degree of 
residual paralysis at the present time. The five subgroups were then compared in an 
attempt to determine whether or not their reactions on the Rosenzweig Picture- 
Frustration Study differed significantly. 
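The paper does not state which test of significance was applied to the differences between means. As one possible illustration only, the sketch below computes a large-sample critical ratio from group means, standard deviations, and sizes, and compares it with the conventional two-tailed normal cutoffs for the 5 per cent and 1 per cent levels. The group sizes (236 and 29) are those given for the male normative and post-poliomyelitic groups; the means and standard deviations are hypothetical, and with a group as small as 29 a t test would ordinarily be preferred.

```python
from math import sqrt

def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error."""
    standard_error = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error

def significance_level(ratio):
    """Two-tailed level under a normal approximation (an assumption of this sketch)."""
    if abs(ratio) >= 2.576:
        return "1%"
    if abs(ratio) >= 1.96:
        return "5%"
    return None

# Group sizes from the text; means and standard deviations are hypothetical.
cr = critical_ratio(45.0, 12.0, 236, 40.0, 11.0, 29)
print(f"critical ratio = {cr:.2f}, significant at: {significance_level(cr)}")
```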


RESULTS AND PSYCHODYNAMICS


Comparison of Post-poliomyelitics with Normals. Table 2 presents the results of
the comparison of the post-poliomyelitics with Rosenzweig’s normative group, for
the sexes separately and then for the total group. Inspection indicates some
significant differences between the responses of the post-poliomyelitic group and those
of the normative group used by Rosenzweig. The post-poliomyelitic group is less
Extrapunitive than the normative group. The females are much more Intropunitive
than the females of the normative group, but the males in this study are not
significantly different from the normative group. The males are, however, significantly
more Impunitive than the normative group, while the females in this study are not
significantly different in Impunitive responses from the females of the normative group.


TABLE 2. Comparison of Post-Poliomyelitics with Rosenzweig’s Normative Group

[Cell values are not recoverable. Columns: Category; Normative Group (Male N = 236,
Female N = 224), Mean and S.D.; Post-Polio (Male N = 29, Female N = 53), Mean and S.D.;
Difference between Means. Rows are given for Male, Female, and Total within each category.]


It is also apparent in Table 2 that while the male subjects do not differ
significantly in the O-D dimension, the female group does differ from the normative
group at the 5% level of significance. As a group, however, the post-poliomyelitic
subjects do differ significantly from Rosenzweig’s normative group in the direction
of expressing less sensitivity to blocking by frustrations. This is significantly true of
the females. The female post-poliomyelitic is more aware of frustrations than the
male post-poliomyelitic, who does not differ significantly from the males of
Rosenzweig’s normative group. The post-poliomyelitic males’ need for ego-defense is
significantly less than that of the males of the normative group, and less than that of
the females of the post-poliomyelitic group. This would seem to be consistent with the
fact mentioned above, that the post-poliomyelitic male tends to be significantly more
Impunitive than the male of Rosenzweig’s normative group.


Comparison of Subjects with Varying Degrees of Residual Involvement. The 
post-poliomyelitic subjects of this study were divided into five subgroups in terms 
of the apparent degree of residual paralysis at the present time. “0” apparency
represents an individual who has no apparent deformity while “4” means that the
subject is so severely involved that he or she is incapable of major movements even
with the aid of prosthetic devices. “1,” “2,” and “3” represent qualitative
differences between the extremes of “0” and “4.” Each of the five subgroups differed
significantly on the Rosenzweig Picture Frustration Study. Table 3 gives the means 
and standard deviations of subjects as compared one with another on the various 
factors used in the Rosenzweig Picture Frustration Study. 


Table 4 gives the statistically significant results of the inter-apparency
relationships. When “0” is paired with a “1,” it means that the subjects with “0”
apparency have a larger score than does the group with a “1” degree of apparency. It
becomes apparent by looking at this table that there are not as many significant
intergroup differences as one might anticipate. Table 4 lists only those relationships
which are statistically significant, showing those relationships at the 1 per cent and
5 per cent level of significance.


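TABLE 4. Statistically Significant Subgroup Relationships

[Column placement of the entries is not recoverable. Columns: Picture-Frustration Test
Category; 1% level (Male, Female); 5% level (Male, Female). Rows: Intropunitive (I);
Impunitive (M); Obstacle-Dominance (O-D); Need-Persistence (N-P); Group Conformity
Rating (G.C.R.). Legible entries: Intropunitive (I), 3-1 and 3-4; Impunitive (M), 0-1.]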















SUMMARY AND CONCLUSIONS 


In this paper we have compared a group of 82 post-poliomyelitic subjects on the 
Rosenzweig P-F Study with Rosenzweig’s normative group. Some differences be- 
tween the post-poliomyelitic group and the Rosenzweig normative group were found 
at statistically significant levels which at best indicate trends because of the size of 
the sample. The major findings of the study may be summarized as follows: 


(a) When the post-poliomyelitic male is compared with Rosenzweig’s norma- 
tive group he may be described as one who is less Extrapunitive, more Intropunitive 
and more Impunitive than the male of the normative group. He may be further 
described as one who does not differ significantly in his reactions to obstacles causing 
frustration. However, he is somewhat less Ego-Defensive but has a greater need 
to find the solution to problems of frustration. 


(b) The female post-poliomyelitic, when compared with the Rosenzweig 
female norms, may be described as a person who is less Extrapunitive, more Intro- 
punitive but not significantly different with reference to Impunitiveness. Further- 
more, she may be described as one who is significantly more aware of the obstacles 
causing the frustrations and one who has a greater need to find the solution to frust- 
rating problems. She does differ significantly with reference to her Ego-Defense 
system. 


(c) Comparing the male and female post-poliomyelitic, we may say that 
differences are found at the following points: (1) The male is less Intropunitive while 
the female is less Impunitive; (2) the male is less aware of the obstacles causing 
frustrations, but (3) the female is more Ego-Defensive when encountering frustra- 
tions. 


(d) Females with serious involvement are more Intropunitive than those with
minimal involvement. Females with no apparent residual paralysis are also more
Impunitive than those with minimal involvement. Females with involvement rated
as “2” are more aware of the obstacles occasioning the frustration than those with
no apparent residual. However, females with no apparent residual have a greater
need to overcome the frustrating obstacle than those with very serious involvement.
HUMAN FIGURE DRAWINGS BY MENTALLY RETARDED MALES*


MANFRED F. DE MARTINO 
Southbury Training School, Connecticut 


PROBLEM 


Of the more recently developed projective techniques, human figure drawings 
appear to be particularly useful in psychodiagnostic work with the mentally re- 
tarded. In addition to not requiring any verbal response on the part of the partici- 
pant, as in many of the more widely utilized projective devices, they are easily ob- 
tained and can be successfully produced by individuals of relatively low intelligence. 
At present, the use of human figure drawings is limited by the comparative lack of 
interpretive validity. Many of the interpretations presented in the literature seem 
to have been derived primarily on a clinical and empirical basis. While such ap- 
proaches have much to offer, the efficacy of figure drawings could be greatly en- 
hanced through the establishing of experimental validation. Several recent studies 
(3, 5, U, 2) have, in part, made contributions toward this end. It is hoped that the 
present exploratory study, which is principally concerned with an investigation of 
male figure drawings produced by institutionalized mentally retarded non-homo- 
sexual and homosexual males will further aid in the establishing of interpretive 
validity. 

METHOD


In the first part of this investigation, human figure drawings were obtained 
from 100 mentally retarded males (known and suspected homosexuals were not 
included) at the Southbury Training School ranging in age from 11 to 36 years. The 
mean age of the group was 19.5. Their IQ’s ranged from 44 to 78 with a mean of 
63.4. All IQ’s were based on results from the Wechsler-Bellevue (Forms I and II), 
the Wechsler Intelligence Scale for Children, and Revised Stanford-Binet (Forms L 
and M). Approximately half of the drawings were collected by the Southbury Train- 
ing School teachers and the remainder by the writer. During class periods, on sheets 
of unlined paper 8½” x 11”, various teachers asked their pupils to “Draw a whole
person. Draw all of the person.” (This phraseology was used as it seemed most
feasible.) The teachers were instructed only to allow the use of a lead pencil, not to 
mention any specific sex or person, and to answer all questions in a nondirective 
fashion. After the drawings were completed, the subjects were asked to graphically 
indicate the sex of their productions and to write their names on the back of the 
drawings. In those cases where the individual was unable to write, this was done 
by the teacher. (The relatively small size of the classes at Southbury enabled the 
teachers to keep in close contact with the children throughout the experiment.) 
The data were then given to the writer. Several weeks later the same general procedure
described above was followed, except that the subjects were asked to “Draw
a whole man. Draw all of the man.” (These latter drawings were the ones used in
analyzing the characteristics referred to below.) This approach was necessary in 
order to determine which sex was drawn first, as well as to get a drawing of a male 
figure from all subjects. Data collected by the writer were obtained during personal 
interviews. On sheets of unlined paper 8½” x 11”, each subject was asked to “Draw
a whole person. Draw all of the person.” While the participant was in the process of
drawing, the writer remained a distance away so as to minimize feelings of anxiety 
or embarrassment. After the drawing was completed, the subject was asked to in- 
dicate whether it was a male or female. In those instances in which a female figure 
was drawn first, the subject was given another sheet of paper and instructed as 
follows: “Now draw a whole man. Draw all of the man.”


*The writer wishes to thank Dr. C. Edward Stull, Director of Psychological Service at the South- 
bury Training School, Connecticut, for his invaluable aid in the statistical computations and general 
experimental design. 
The second part of this study was mainly devoted to a comparison of male
figure drawings by 37 mentally retarded homosexual males with those produced by 
a control group of 37 mentally retarded non-homosexual males. Included in the 
homosexual group were only individuals who were definitely known to have engaged 
in overt homosexual behavior on at least several occasions and who were 15 years of 
age or older. In each instance, verification of homosexuality was confirmed by either 
the cottage attendant in charge or the assistant supervisor of boys. Case history 
records were also utilized for corroborative evidence. The age range of the homosexuals
was from 15 to 35, with a mean of 23.9. Their IQ’s ranged from 46 to 77,
with a mean of 62.7. The control group of 37 non-homosexuals was selected from the 
100 subjects described above on the basis of obtaining similar mean groupings of 
CA’s and IQ’s and comparable etiological groupings. (About 70% of the subjects 
in each group were “Familials.”) The age range of this group was from 15 to 36,
with a mean of 24.0. Their IQ’s ranged from 46 to 77, with a mean of 63.0. Statis- 
tical tests did not reveal any significant differences between the means of these 
groups. Data from the homosexual group were collected by the writer during per- 
sonal interviews following the method described above. 

After having noted which sex was drawn first by the various participants, the 
male figure drawings were analyzed in terms of the characteristics listed below. The 
findings based on drawings by homosexuals were then compared with those of the 
control group and levels of confidence were computed according to Fisher’s
Exact Treatment of 2 x 2 Tables.


Characteristics Analyzed* 


Head: front view or profile. 

Location on page: determined by dividing paper into nine equal parts and categorized 
as to location of major portion of figure. 

Figure small in size: less than ½ of page in terms of width and length.

Figure large in size: more than ½ of page in terms of width and/or length.
Large head: more than ½ the size of trunk in terms of width and/or length.
Small head: less than ½ the size of trunk in terms of width and length.

Hair on head: presence, absence. 

Eyes represented by: circles, dots, crosses, dashes or curves. 

Absence of eyes. 

Eyelashes: presence, absence. 

Eyebrows: presence, absence. 

Ears: presence, absence. 

Nose: (no special formation): presence, absence. 

Mouth open: no middle line to indicate separate lips. 

Mouth closed: represented by single line or middle line to indicate separate lips.
Absence of mouth. 

Object in mouth: cigarette, pipe, etc. 

Teeth: presence, absence. 

Full or sensuous lips: female contours. 

Mustache or beard: presence, absence. 

Neck: presence, absence. 

Arms: presence, absence. 

Arms held at a distance from the body: arms hanging at an angle away from the vertical. 
Arms perpendicular to body: arms at right angles to body. 

Arms in front of body. 

Arms directed upward from shoulder area. 

Arms held rigidly to side: no space intervening between the arms and the line of the
body.

Hands: (an approximate structure): presence, absence. 

Fingers: presence, absence. 

Emphasis on genital area: presence, absence. 

Shading of waist area, arms, legs, etc. 

Delineation of male breast or hair on chest. 

Legs: presence, absence. 

Feet or shoes (no special formation): presence, absence. 

High heels (markedly higher than ordinarily seen on men’s shoes): presence, absence. 
Shoelaces: presence, absence. 

Articles of clothing: hat, tie, belt, buttons, shirt, etc. 

Position of figure: standing, sitting, lying. 

General proportions of figure: good, poor. 


*These characteristics in part were designed after those used in a study by Holzberg and Wexler.


RESULTS


Table 1 contains those characteristics which were present in 20% or more of 
the drawings by 100 mentally retarded non-homosexual males. An examination of 
this table reveals the following: (a) In 74% or more of the cases the figure was 
portrayed in a standing position; the general proportions were poor; legs, arms, a 
nose, and feet or shoes, were present; the head was large; a male figure was drawn 
first, with the head in front view; the mouth was represented as open; and hair on 
head was present. (b) In between 48 and 67% (inclusive) of the cases the figure was 
large in size; fingers, ears, eyebrows and buttons were present; eyes were represented 
by dots; and the arms were held at a distance from the body. (c) In between 20 and 
40% (inclusive) of the cases the figure was small in size and was placed in the upper 
left portion of the page; eyes were represented by circles; a neck, hat, hands, belt, 
and teeth were present; the mouth was represented as closed; and the arms were 
represented as perpendicular to the body. 


TABLE 1. AN ANALYSIS OF FIGURE DRAWINGS BY 100 MENTALLY RETARDED
NON-HOMOSEXUAL MALES*

                              Per Cent                                   Per Cent
Characteristics                Noted     Characteristics                  Noted

Standing position                99      Eyebrows                           57
Legs                             99      Eyes represented by dots           53
General proportions: poor        97      Arms held distance from body       48
Arms                             96      Buttons                            48
Nose                             93      Eyes represented by circles        40
Large head                       91      Neck                               39
Feet or shoes                    90      Figure small in size               37
Male figure drawn first                  Location: Upper left of page       31
Front view                       88      Hat                                26
Mouth open                       76      Hands                              26
Hair on head                     74      Belt
Fingers                          67      Mouth closed                       21
Figure large in size             63      Teeth                              21
Ears                             61      Arms perpendicular to body         21

*Only those characteristics which were present in 20% or more of the drawings are included.














The following characteristics were present in fewer than 20% of the drawings:
female figure drawn first; head in profile; location on page: center, left center, lower 
center, upper center, upper right, lower right, lower left; small head; eyes represented 
by dashes or curves; eyelashes; full or sensual lips; pipe; cigarette; mustache; beard; 
male breasts; hair on chest; tie; shirt; emphasis on genital area; shading of legs; arms 
held rigidly at side; arms in front of body; arms held upward from shoulder area; 
shoe laces; high heels; and figure portrayed in a sitting position. 

In Table 2, P-values are presented for those drawing characteristics in which
differences (between homosexuals and non-homosexuals) significant beyond the .05
level of confidence were noted. From an examination of this table it can be seen
that: (a) Eyelashes appeared significantly more in the drawings by homosexuals
than in those by non-homosexuals. (b) High heels appeared significantly more in
the drawings by homosexuals than in those by non-homosexuals.


TABLE 2. SIGNIFICANT DIFFERENCES BETWEEN HOMOSEXUALS AND NON-HOMOSEXUALS

                      Homosexuals          Non-Homosexuals
                       (N = 37)               (N = 37)
                    Number of Times        Number of Times
Characteristics         Present                Present

Eyelashes                  9                      2
High heels                 8                      1

*P-values derived according to Fisher’s Exact Treatment of 2 x 2 Tables.
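The exact test behind Table 2 can be illustrated from the reported counts alone. The following is a minimal sketch, not the author's original computation: it assumes a one-tailed Fisher exact test on a 2 x 2 table (characteristic present or absent, by group), and the function name `fisher_one_tailed` is hypothetical.

```python
from math import comb

def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher exact P for the 2 x 2 table [[a, b], [c, d]].

    Returns the probability of observing `a` or more cases in the first
    cell when row and column totals are held fixed (hypergeometric tail).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return tail / total

# Counts from Table 2 (N = 37 per group): present, absent for each group.
p_eyelashes = fisher_one_tailed(9, 28, 2, 35)
p_high_heels = fisher_one_tailed(8, 29, 1, 36)
print(round(p_eyelashes, 3), round(p_high_heels, 3))
```

Under these assumptions both tail probabilities fall below .05, which agrees with the statement that the two differences are significant beyond the .05 level. Because the two groups are of equal size, doubling each value gives the corresponding two-tailed probability.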


DISCUSSION

Except in a very few cases, the subjects responded with a minimum of re- 
luctance or hesitation. Those who expressed reluctance did so on the grounds that 
they were unable to draw well. Encouragement and support, however, quickly dis- 
pelled all such feelings. It is felt that in large measure the data presented herein 
can serve to aid in differentiating usual (frequent) from unusual (infrequent) re- 
sponses. Since unusual responses are generally the more revealing psychologically, 
this would tend to increase the clinical significance of certain drawing character- 
istics. Of particular interest were the findings in which drawings by homosexuals 
contained significantly more eyelashes and high heels than those by non-homosex- 
uals. (These characteristics, however, appeared in fewer than 25% of the drawings 
by homosexuals.) Machover has stated that, “The homosexually inclined male . . .
may give large eyes with lashes to the figure of the male, in combination with a well
specified high heel.” (p. 48) Levy asserts that, “If the eyes are very large and if
those of the male figure have lashes, the subject is almost surely a homosexual.”
(7, p. 278) Barker, et al., however, did not find any such trends. Moreover,
whereas Machover and Levy have implied that homosexuals tend to draw the
opposite sex first, this was not found in the present investigation, nor was it observed
in the study by Barker, et al. In view of these various discrepancies, it is
apparent that additional research is needed before conclusive statements can be 
made regarding human figure drawings by homosexuals. 


SUMMARY 


The primary purpose of this study was to investigate the nature of male figure 
drawings produced by 100 institutionalized mentally retarded non-homosexual 
males, and to compare the male figure drawings by 37 mentally retarded homosexual 
males with those produced by an equivalent group of non-homosexuals. A secondary 
purpose was to ascertain which sex was drawn first by the various subjects. 

All of the male figure drawings were analyzed in terms of frequency according 
to certain predetermined characteristics. Notable findings regarding the drawings 
by homosexuals were as follows: (1) Most drew their own sex first. (2) High heels and 
eyelashes appeared significantly more in their drawings than in those by non- 
homosexuals. 
EXAMINER INFLUENCE ON THE RORSCHACH


DAVID BERGER¹
Fort Custer, Michigan 


PROBLEM 


Most clinicians utilizing the Rorschach in their practices have noted the various 
influences occurring during the testing period which seemingly affect the performance
of their clients. With the accrual of these clinical observations, and their
experimental analogues, it has become increasingly clear that Rorschach
workers must develop greater sensitivity to the many influences permeating the test 
situation. If the test were constant and immutable, as it was once considered to be, 
it could be administered indiscriminately, without regard for the many subtle 
extranea occurring from situation to situation. It would then be possible to obtain 
a stable portrait of personality under almost any testing conditions. 

The present paper deals with one phase of the interpersonal contact in the 
testing situation, i.e., examiner variance. For our purposes we shall define examiner 
variance as that configuration of influence portrayed in the test performance at- 
tributable to the nature of the particular examiner-client interaction. The research 
on examiner variance relates to two basic questions. The first objective was to esti- 
mate the amount of variance in the client’s response to the test which resulted from 
the interaction between the client and the examiner. The second problem
was concerned with the factors in the stimulus value of the examiner which contribute
to these apparent influences. The present research was designed to investigate
both problems. Two hypotheses were formulated: (a) significant differences
would be found in the configuration of the scores on the total test secured by a num- 
ber of examiners; and, (b) an individual’s scores on the Rorschach test and his 
eliciting propensities as an examiner would be significantly correlated. 


METHOD 


As part of their internship training experience, eight graduate students from 
Michigan State College and the University of Michigan were assigned to Fort 
Custer VA Hospital sometime during the period 1947-50. The personal Rorschach 
record of each of these trainees was obtained² and rechecked for consistency of scoring.
The records taken by the students assigned from Michigan State College were
administered when these people enrolled in an introductory Rorschach course. The 
records of the students assigned from the University of Michigan were administered 
as part of the assessment program conducted at Ann Arbor. It may be presumed
that at the time these personal records were administered the students were essen- 
tially unschooled and naive in Rorschach doctrine and lore. The psychograms de- 
rived from these records were employed as measures of the personal adjustment of 
the eight examiners. 

A sample of the Rorschach records of each of the eight examiners’ work at the 
hospital was collected. The number of records per examiner varied from 15 to 21 
due to different lengths of service at the hospital, as well as to the variability in work 
load assumed by training personnel. The complete sample of patients tested was 
evaluated to determine whether any difference existed in age, IQ, or diagnosis for 
those patients seen by each of the eight examiners. The differences obtained by an 
analysis of variance of the age and IQ variables for the population of patients tested 
by each examiner were not significant. A chi-square test was made of the final
diagnosis reached at the hospital for each testee in the sample seen by each examiner.
Again, the demonstrated differences were insignificant. Accordingly it was concluded
that no significant differences in age, IQ, or final psychiatric diagnosis existed
among the sample of patients tested by each of the eight examiners.

¹The writer wishes to acknowledge his gratitude to Drs. Paul Greenberg, Donald Johnson, and
David Pearl for their helpful suggestions concerning the manuscript.
²The author is deeply indebted to Dr. A. I. Rabin and Dr. E. L. Kelly, who made the trainees’
personal Rorschach records available for this study. Wherever possible the anonymity of the trainees
was respected.

The Rorschach psychograms elicited by each of the eight trainee examiners 
were subjected to the following analysis: the median absolute score for each exam- 
iner on each of the 12 scoring variables was calculated. The examiners were ranked 
from one to eight according to the absolute magnitude of these median elicited scores. 
For each of the 12 scores the examiner who “provoked” the largest median was
assigned rank one. (Whenever it was necessary to assign ranks in other aspects of
the treatment of data, rank one was uniformly reserved for the examiner with the
largest score.) The examiners were similarly ranked on the median percentage
(absolute score) of the various scores elicited. From these two sets of ranks for each
examiner on each scoring variable, combined ranks were derived by adding the
absolute and percentage ranks. The combined ranks were employed to reduce the 
spurious character of simple raw score ranks and percentage score ranks. In the case 
of R, F+%, and A%, combined ranks were not feasible. The final ranking for R 
was an absolute ranking; for F+ and A, a percentage ranking. The personal Ror- 
schach psychograms of the eight students were handled in an identical fashion. The 
absolute and percentage scores produced by each of the eight examiners were cal- 
culated. From this point on they were treated exactly as the elicited Rorschach 
scores so as to obtain absolute, percentage, and finally combined ranks. The R, 
F+%, and A% scores were limited again to simple ranks as in the case of the elicited 
Rorschach protocols. 
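The ranking procedure described above can be sketched for a single scoring variable as follows. This is only an illustration under stated assumptions: the scores are invented, three hypothetical examiners (A, B, C) stand in for the eight, and the helper `rank_descending` is not part of the original study. Tied values are given the mean of the ranks they occupy, which is an assumption, since the text does not say how ties were handled.

```python
from statistics import median

def rank_descending(values):
    """Rank values so the largest receives rank 1; ties share the mean rank."""
    order = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = order.index(v) + 1      # first position held by v
        ties = order.count(v)           # how many values equal v
        ranks.append(first + (ties - 1) / 2)
    return ranks

# Hypothetical elicited scores on one variable, per examiner.
absolute = {"A": [3, 5, 4], "B": [1, 2, 2], "C": [6, 7, 5]}
percentage = {"A": [10, 12, 15], "B": [8, 9, 7], "C": [11, 10, 9]}

examiners = sorted(absolute)
abs_ranks = rank_descending([median(absolute[e]) for e in examiners])
pct_ranks = rank_descending([median(percentage[e]) for e in examiners])

# Combined rank = absolute rank + percentage rank, as in the text.
combined = {e: a + p for e, a, p in zip(examiners, abs_ranks, pct_ranks)}
print(combined)
```

For R, F+%, and A% only one of the two rankings would be used, as the text notes, so the addition step would be skipped for those variables.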


RESULTS 


In order to determine whether any significant difference existed between the 
eliciting tendencies of the eight examiners, a coefficient of concordance was com- 
puted. This analysis was calculated on the ranks of the eight examiners for the total 
of 12 elicited scores. Table 1 presents the data which yielded an insignificant coefficient
(W = .059; χ² = 4.95 for seven degrees of freedom). No significant concordance
could be disclosed between the ranks of the 12 scores for the eight examiners.
Therefore, the initial hypothesis that significant overall differences existed
in the 12 scores of the eight examiners was rejected.


TABLE 1. RANKS FOR THE ELICITED SCORES OF THE EIGHT EXAMINERS
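The coefficient of concordance used here can be sketched as follows. This is a minimal illustration assuming Kendall's W without a correction for ties and the usual chi-square approximation, χ² = m(n - 1)W, with m rankings of n objects; the function names and the small rank table are hypothetical and are not the study's data.

```python
def kendalls_w(rank_table):
    """Kendall's coefficient of concordance W (no tie correction).

    rank_table: list of m rankings, each a list of n ranks (1..n).
    """
    m = len(rank_table)
    n = len(rank_table[0])
    totals = [sum(ranking[j] for ranking in rank_table) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))

def concordance_chi_square(w, m, n):
    """Chi-square approximation for W, with n - 1 degrees of freedom."""
    return m * (n - 1) * w

# Three hypothetical rankings in perfect agreement give W = 1.
perfect = [[1, 2, 3, 4]] * 3
print(kendalls_w(perfect))

# With m = 12 scores and n = 8 examiners, a chi-square of 4.95
# corresponds to W = 4.95 / (12 * 7).
w_implied = 4.95 / (12 * (8 - 1))
print(round(w_implied, 3))
```

Under this approximation the reported chi-square of 4.95 on seven degrees of freedom implies a W of about .059, that is, very little agreement among the 12 sets of ranks.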

Our second approach to the question of examiner variance was to determine 
whether any relationship could be demonstrated between the examiner’s own per- 
formance on the test and his eliciting tendencies on the test. The rank of each ex- 
aminer on each of the 12 Rorschach scores of his personal Rorschach was compared 
with his rank on each of the 12 corresponding median scores of his elicited Rorschachs. 
The ranks for each score were obtained as described so as to derive nine combined 
ranks and three simple ranks. Rhos were computed to determine whether any re- 
lationship existed between each independent score on the eight examiners’ records 
and the identical median score elicited from patients by the eight examiners. A 
significant positive relationship was demonstrated for the popular (rho = +.86, 
p = .01) and white space (rho = +.80, p = .03) responses. This suggests that 
examiners somehow tend to transmit these qualities to their clients in the Rorschach 
test milieu. Individuals with high popular indices in their own personalities evoke 
correspondingly high indices in their subjects. Conversely, those with low popular
propensities elicit records with fewer popular responses. A similar relationship was 
manifest in the case of the white space response. The probability of obtaining two 
correlations out of 12 at this level of significance is remote. Also worthy of attention 
was the correlation —.54 obtained on the Y variable. This suggests a trend for high 
Y people to elicit fewer Y responses from the clients, and for low Y people to “pro- 
voke” more Y responses from their subjects. 
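The rank correlations reported above can be illustrated with the standard formula for rho, 1 - 6Σd²/(n(n² - 1)), which applies when there are no tied ranks. The sketch below is an assumption-laden example: the two rank lists are invented for eight hypothetical examiners and are not the study's data, and the study's combined ranks may have contained ties that this simple formula does not handle.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman rank correlation for two untied rankings of equal length."""
    n = len(ranks_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical ranks of eight examiners on one score:
# rank on the personal record vs. rank on the median elicited score.
personal = [1, 2, 3, 4, 5, 6, 7, 8]
elicited = [3, 2, 1, 5, 4, 7, 6, 8]

rho = spearman_rho(personal, elicited)
print(round(rho, 2))
```

With only eight examiners, a sum of squared rank differences of 12 already yields a rho near .86, which shows how few rank displacements separate a high correlation from a moderate one in a sample of this size.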


SUMMARY AND CONCLUSIONS 
Personality testing was for some years premised on the presumption that test
results were untainted by the immediate and specific conditions under which the
tests were administered. With the increasing awareness of the interpersonal equa- 
tion in social relations, the nature of the interaction between the examiner and 
patient has come under closer scrutiny. In the present study the effect of the Ror- 
schach examiner on the client’s response to the test was investigated. No gross dis- 
crepancies could be found among the total Rorschach configurations elicited by 
eight examiners. It was concluded that the test as a whole reflects only insignificant 
examiner influence. The personal Rorschach record of each of the eight examiners 
was also employed as a measure of the stimulus value of the examiner in the testing 
situation. Significant positive correlations were demonstrated between the popular 
response on the examiner’s personal record and the median popular response elicited 
in the role of examiner. A significant positive relationship was also found between 
the white space responses occurring on the examiner’s own record and the elicited 
records. A trend toward a negative relationship was disclosed on the Y score, such 
that high Y examiners tended to evoke fewer Y responses from their subjects. The 
results were taken as evidence of examiner variance, one aspect of the diverse in- 
fluences which impinge on the examinee during the testing situation. The results 
secured on the P and S variables might be explained by presuming that the authority
value of the examiner’s personality is transmitted to the testee, who reacts to
it in a manner consistent with his effort to conform to the regimen of the treatment 
program at the hospital. It is conceivable that in many instances the patients were 
not avidly interested in obtaining assistance through the medium of psychological 
tests. In taking the tests many subjects were conforming to the demands of the 
hospital setting much as they abided by the general hospital regimen. It may be 
conjectured that the test situation was perceived as part of the hospital situation 
evoking a comparable tendency to conform. The implication of these results for the 
interpretation of the test in a hospital setting is to alert the clinician: (a) to the
possible contamination of his patient’s test performance by his own personal adjustment;
and (b) to the “institutional” test condition.







REFERENCES 


1. Alden, P. and Benton, A. L. Relationship of sex of examiner to incidence of Rorschach responses
with sexual content. J. proj. Tech., 1951, 15, 231-234.
2. Baughman, E. Rorschach scores as a function of examiner difference. J. proj. Tech., 1951, 15,
243-249.
3. Beck, Samuel J. Rorschach’s test II. A variety of personality pictures. New York: Grune &
Stratton, 1945.
4. Bleuler, M. Der Rorschach-Versuch als Unterscheidungsmittel von Konstitution und Prozess.
Ztschr. f. d. ges. Neurol. Psychiat., 1934, 151, 571-578.
5. Calden, G. and Cohen, L. B. The relationship of ego involvement and test definition to Rorschach
test performance. J. proj. Tech., 1953, 17, 300-311.
6. Gibby, R. G. The stability of certain Rorschach variables under conditions of experimentally
induced set: I. The intellectual variables. J. proj. Tech., 1951, 15, 3-25.
7. Gibby, R. G. Examiner influence on the Rorschach inquiry. J. consult. Psychol., 1952, 16,
449-455.
8. Hutt, M., Gibby, R. G., Milton, O., and Pottharst, K. The effect of varied experimental
“sets” upon Rorschach test performance. J. proj. Tech., 1950, 14, 181-186.
9. Kelly, E. L. and Fiske, D. W. The prediction of performance in clinical psychology. Ann
Arbor: University of Michigan Press, 1951.
10. Kendall, M. G. Rank correlation methods. London: Charles Griffin & Co., Limited, 1948.
11. Lord, Edith. Experimentally induced variations in Rorschach performance. Psychol. Monogr.,
1950, 64, No. 10 (Whole No. 316).
12. Moses, Lincoln E. Non parametric statistics for psychological research. Psychol. Bull., 1952,
49, 122-143.
13. Sanders, Richard and Cleveland, S. E. The relationship between certain examiner personality
variables and subject’s Rorschach scores. J. proj. Tech., 1953, 17, 34-50.
14. Schachtel, E. G. Subjective definitions of the Rorschach test situation and their effect on
test performance. Contributions to an understanding of Rorschach’s test. Psychiatry, 1945, 8,
419-448.
15. U. S. Army Air Forces Aviation Psychology Program Research Report. (J. P. Guilford, Ed.)
Printed classification tests. Report No. 5. Washington, D. C.: Government Printing Office, 1947.
16. Weber, L. C. Ethics in administering the Rorschach test. J. abnorm. soc. Psychol., 1953, 48,
443.







COMPARATIVE RELIABILITY AND VALIDITY OF THE HEALY
COMPLETION TEST II AND A REVISED FORM¹


ERNA SCHWERIN
The Northwest Guidance Center, Lima, Ohio

MYLEN E. FITZWATER
Bowling Green (Ohio) State University


PROBLEM 


The purpose of this paper is (a) to compare the reliability of the Healy Pictorial 
Completion Test II and a revised form representing an adaptation to modern dress 
and to up-to-date pictorial environment; (b) to obtain estimates of validity; and (c)
to compare the performances of randomly selected school children on both tests.
Hypotheses formulated were that the revised test will be more reliable and valid. 

Contemporary environmental situations and needs have been stressed by 
various writers as one of the most desirable attributes of a test,
especially in connection with the use with children. The need for continual restandardization
of existing tests has been pointed out by Watson. He states that
the Healy Completion Test II suffers from lack of adequate standardization and 
suggests more basic and carefully planned studies with it. Although the normative 
group for this test received only very scant description, and although reliability and 
validity were not adequately established, the Healy Completion Test II is among 
the most widely used performance tests today as a component of the Arthur Point
Scale. While the latter was completely revised, the Healy Completion
Test II has remained unchanged since 1917. The studies which have been made
with this test, while of considerable interest value, fall short of those systematic
and more basic goals which the present study hopes to achieve.

¹The authors express their appreciation to Drs. Cecil M. Freeburne and Frank C. Arnold for their
suggestions and criticisms, and to the Administrations of the Lima elementary and parochial school
systems for their cooperation in providing the subjects used in this study.


METHOD 


The group studied consisted of a random sample drawn from the population of 
seven hundred thirty-one eleven year old children in the elementary and parochial 
schools of Lima, Ohio, and consisted of seventy-eight girls and seventy-two boys. 
Grade placement was disregarded, so that a very small number attended third, a 
few attended fourth, and the majority of pupils attended fifth and sixth grades. 

Public school grades were used as a validating criterion for this study. Since 
the grading system of the Lima public schools does not utilize numbers or letter 
grades, it was necessary to substitute an equivalent numerical index replacing the 
teacher’s evaluations in the grade cards. From the rated items included in the 
progress reports, twenty-two were selected to represent the criterion measure. 
These items were: Spoken English: speaking clearly; speaking correctly. Written 
English: writing clearly; writing correctly. Vocabulary: ability to understand words; 
ability to use words. Reading skills: recognizing words; getting meaning from books; 
recalling the contents; using information read. Spelling correctly: words in lists; 
words in compositions. Citizenship in a Democracy: social studies (geography; his- 
tory; current events; science). Use of Reference Books: finding help from the dic- 
tionary; looking up data in encyclopedias; making wise choice and use of library 
books. Arithmetic: knowing number facts; being accurate; showing growth in ability 
to solve problems. 

The grades assigned for achievement in the above subjects were: excellent 
progress, satisfactory progress, shows improvement, needs improvement. These 
grades were assigned the numerical values of three, two, one, and zero. Since certain 
items in the above list of school subjects were found to have greater relevance in 
terms of reflecting scholastic achievement, they were more heavily weighted than 
others, viz., multiplied by two. These items are all those subsumed under the 
categories of Social Studies and Arithmetic. On the basis of these scoring criteria, 
the highest score obtainable was eighty-seven, and the lowest was zero. Since the 
parochial schools differed in their grading systems from those of the city schools, the 
pupils attending parochial schools were omitted from the validity comparisons to 
ensure uniformity of treatment of the statistical data. This resulted in the omission 
from the validity comparisons of twenty-two pupils. 
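The arithmetic of the criterion score can be checked with a short sketch. It assumes the 22 items are distributed over categories exactly as listed above (with geography, history, current events, and science counted as the four Social Studies items) and that the double weight applies to the Social Studies and Arithmetic items only; the names `ITEMS_PER_CATEGORY` and `criterion_score` are hypothetical.

```python
# Number of rated items per category, read from the list in the text (22 in all).
ITEMS_PER_CATEGORY = {
    "Spoken English": 2,
    "Written English": 2,
    "Vocabulary": 2,
    "Reading skills": 4,
    "Spelling": 2,
    "Social Studies": 4,
    "Use of Reference Books": 3,
    "Arithmetic": 3,
}
DOUBLE_WEIGHTED = {"Social Studies", "Arithmetic"}
GRADE_VALUES = {
    "excellent progress": 3,
    "satisfactory progress": 2,
    "shows improvement": 1,
    "needs improvement": 0,
}

def criterion_score(grades):
    """Sum weighted grade values; grades maps category -> list of grade labels."""
    total = 0
    for category, labels in grades.items():
        weight = 2 if category in DOUBLE_WEIGHTED else 1
        total += weight * sum(GRADE_VALUES[label] for label in labels)
    return total

best = {c: ["excellent progress"] * k for c, k in ITEMS_PER_CATEGORY.items()}
worst = {c: ["needs improvement"] * k for c, k in ITEMS_PER_CATEGORY.items()}
print(sum(ITEMS_PER_CATEGORY.values()), criterion_score(best), criterion_score(worst))
```

Fifteen singly weighted items at three points each give 45, and seven doubly weighted items give 42, which reproduces the stated maximum of eighty-seven and minimum of zero.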


TEST MATERIALS


The standard equipment distributed by C. H. Stoelting Co. for the Healy 
Completion Test II was used in this study. A more detailed description of the materials
and of the test instructions can be found elsewhere.

The revised form of the Healy Completion Test II was designed by a student²
of the Art Department of Bowling Green State University for the purpose of this 
study. Specific instructions were given to leave unchanged the situations of the 
pictures, and to preserve gestures and facial expressions of the individuals appearing 
in the pictures. The only changes which distinguish the revised form from the 
original test pertain to the modernization of clothing, of furniture, and of objects 
similar in function to those no longer in use, in a few of the sixty pieces used for in-
sertion: e.g., instead of the balloon utilized in the old test, a blimp was drawn. The
policeman in picture seven was given a modern uniform. The background color 
was made somewhat lighter than the original one, and thus meets Healy’s“: »- 2% 
specification of bright coloring more closely. In general, special care was taken to 
avoid introducing any variables into the revision which could not be accounted for 


²Miss LaVerne O. Romanchack.
and which might tend to distort the test results. The artist used water colors of ap-
proximately the same shades as shown in the old test (except for the somewhat
lighter background), and a colorless spray was applied to preserve the colors. 

The same scoring methods were applied to both tests, since the revised form 
utilizes the assumption, methods, and principles of the test as conceived by Healy. 
Raw scores were used for evaluation in this study. 


EXPERIMENTAL PROCEDURES 


The pupils were tested individually in the schools, usually in an empty class- 
room. The standard test instructions“: ». 5°!) were given in each case. The testing 
sequence was alternated for each testing session: beginning with the old test in one, 
and with the new one in another. Pupil cooperation was excellent throughout. 

Since the study was to cover three phases, viz., reliability and validity of both 
tests, and comparison between them, the group of one-hundred-fifty subjects was 
divided by closed random method of assignment into three groups of fifty pupils 
each. One of the groups of fifty subjects was randomly subdivided into two groups 
of twenty-five subjects each. The individual groups were tested in the following 
manner: 


Group A: old test—retest, 
Group B: new test—retest, 
Group C1: old test—new test,
Group C2: new test—old test. 


The counter-balanced order of testing (new test - old test, or old test - new test) was 
likewise randomly determined to achieve a balance of practice effects. A time inter- 
val of three weeks was employed between test administrations for all subjects. 
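
The allocation of the one hundred fifty pupils to the four testing conditions can be illustrated as follows. The sketch assumes a simple shuffle; the text does not spell out what the "closed random method" involved, so the function name, the seed, and the use of consecutive pupil numbers are all hypothetical.

```python
import random


def assign_groups(pupil_ids, seed=1953):
    """Shuffle 150 pupils and split them into groups A, B, C1 and C2.

    A simple random permutation stands in for the study's assignment
    procedure, which is not described in detail.
    """
    pupils = list(pupil_ids)
    if len(pupils) != 150:
        raise ValueError("the design calls for exactly 150 pupils")
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    rng.shuffle(pupils)
    return {
        "A (old test, retest)": pupils[0:50],
        "B (new test, retest)": pupils[50:100],
        "C1 (old test, new test)": pupils[100:125],
        "C2 (new test, old test)": pupils[125:150],
    }


groups = assign_groups(range(1, 151))
for name, members in groups.items():
    print(name, len(members))
```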


RESULTS 


Test-retest reliability coefficients were computed between the scores obtained 
in the first administration and the second administration of the old test, and be- 
tween those of the new test. To ascertain the extent of practice effects, the test of 
significance was used for comparison of the means of the distributions. A com-
parison of the reliability coefficients for the old and the new tests yielded an ob-
tained difference in Fisher’s z values of .376, and a σz of .225, which failed to be
significant. Therefore, the old and the new tests do not differ significantly in their 
reliability, for the groups tested. 
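
The comparison of the two reliability coefficients can be retraced from the figures reported above. The sketch below assumes that the difference in z values was divided by its standard error and referred to the normal curve with a two-tailed test; the text does not name the exact procedure, and the function names are illustrative.

```python
import math


def critical_ratio(z_difference, se_difference):
    """Ratio of a difference in Fisher z values to its standard error."""
    return z_difference / se_difference


def two_tailed_p(ratio):
    """Two-tailed normal-curve probability of a ratio at least this large."""
    return math.erfc(abs(ratio) / math.sqrt(2))


cr = critical_ratio(0.376, 0.225)  # figures reported for the reliability comparison
p = two_tailed_p(cr)
print(f"critical ratio = {cr:.2f}, two-tailed p = {p:.3f}")
```

The ratio comes to about 1.67, short of the 1.96 required at the 5% level, which agrees with the statement that the difference failed to be significant.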

Validity coefficients were obtained between grades and both the Healy Com- 
pletion Test II and the revised form (first administration scores). The sixty-five 
public school pupils given the original form first, and the sixty-three public school 
pupils given the revised form first, made up the validation groups. A test of sig- 
nificance of a difference between the coefficients of validity obtained for the old test 
and grades, and that obtained for the revised form and grades, was likewise applied. 
The obtained difference in z values was .493, and the σz was .181. This difference
was significant beyond the 1% level, so that it was concluded that the revised form 
is more highly valid than the original test. The validity coefficient of .73 obtained 
in this study for the revised test is higher than those reported in the literature for 
studies in which school grades have been used as validating criteria“: ». **), The 
degree of agreement between the correlations found in this study indicates that, 
when grading is carefully done, the revised test closely taps abilities which come into 
play in scholastic achievement. The criterion measure was, then, a useful and real- 
istic instrument for the group tested, and the obtained coefficient of validity is a 
meaningful index, for its group, of the degree of relationship existing between the 
revised form and the criterion. No attempt has been made, within the scope of this 
study, to ascertain whether the old and the revised tests measure apperception, as 
stated by Healy“: »- '*-85), or whether it measures ability to educe relationships, as 
hypothesized by Watson: »- *). On the basis of the obtained validation data it 
can be stated only that both tests are based on the same working conceptions as
those playing an important part in school achievement. 
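
The standard error reported for the validity comparison can be checked against the two validation groups of sixty-five and sixty-three pupils. The sketch assumes the usual formula for the standard error of a difference between two independent Fisher z values; the source does not state the formula, and the function names are illustrative.

```python
import math


def fisher_z(r):
    """Fisher's z transformation of a correlation coefficient."""
    return math.atanh(r)


def se_z_difference(n1, n2):
    """Standard error of the difference between two independent Fisher z values."""
    return math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))


se = se_z_difference(65, 63)  # sizes of the two public school validation groups
cr = 0.493 / se               # reported difference in z values
print(f"z for r = .73: {fisher_z(0.73):.3f}")
print(f"standard error = {se:.3f}, critical ratio = {cr:.2f}")
```

With these group sizes the standard error comes to .181, the value given in the text, and the ratio of about 2.72 exceeds the 2.58 needed at the 1% level.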

The fact that a significantly higher reliability of the revised form has not been 
demonstrated, does not detract from its greater usefulness in comparison with the 
old test. The ultimate index of such usefulness is a test’s validity, which has been 
obtained for the revised form, in connection with the groups tested. The results of 
the study are summarized in Table 1. 


TABLE 1. MEANS, STANDARD DEVIATIONS, AND CORRELATION COEFFICIENTS OF ORIGINAL HEALY,
REVISED HEALY, AND GRADES

                     Original      Revised       Means and Standard Deviations
                     Healy         Healy                    Test       Retest

Original Healy       r = +.78                    M  =       52.9       57.8
                                                 SD =       18.5       11.8

Revised Healy                                    M  =       64.1       68.0
                                                 SD =       14.8

Grades (Validity)                                M  =       49.8
  63                                             SD =       15.7


SUMMARY 


An estimate of reliability and of validity was obtained for the Healy Completion Test II and
for a revised form, designed for this study and utilizing modern dress and current pic-
torial environment. A random sample of one-hundred-fifty school
children (mean age 11.56) was obtained from the Lima (Ohio) public and parochial
schools. They were divided at random into three groups of fifty subjects each; one
group was tested twice with the original test, and one was tested twice with the re-
vised test to establish reliability. The third group was tested with both the old and
the revised form to obtain the intercorrelation between forms. A time interval of
three weeks was used between tests for all groups. Public school grades were used as
a validating criterion for both tests. No significant difference was found between the re-
liabilities of the two tests. The validity of the revised form was significantly
higher than that of the old test when the total public school group was compared,
and this appears to justify replacing the old form by a revision similar to the one
used in this study.


REFERENCES 


ARTHUR, GRACE. A point scale of performance tests. Vol. I. (2nd Ed.) New York: Commonwealth
Fund.
BUROS, O. K. (Ed.) The 1940 mental measurements yearbook. Highland Park, N. J.: 1941.
CRONBACH, L. J. Essentials of psychological testing. New York: Harper’s, 1949.
GOODENOUGH, F. L., and MAURER, K. M. The mental growth of children from two to fourteen
years. Minneapolis: University of Minnesota Press, 1942.
HARROWER, MOLLIE (Ed.) Recent advances in diagnostic psychological testing. Springfield, Ill.:
Chas. C. Thomas, 1950.
HEALY, WM., et al. A manual of individual mental tests and testing. Boston: Little, Brown & Co.,
1928.
HEALY, WM. Pictorial Completion Test II. J. Appl. Psychol., 1921, 5, 225-39.
MURSELL, J. L. Psychological testing. New York: Longmans, Green & Co., 1947.
WATSON, R. I. The clinical method in psychology. New York: Harper’s, 1951.
WECHSLER, D. The measurement of adult intelligence. Baltimore: Williams & Wilkins Co., 1944.
WERNER, H. A comparative study of a small group of clinical tests. J. Appl. Psychol., 1940, 23,
231-36.







SOME EFFECTS OF ALCOHOL ON RORSCHACH PERFORMANCE 
ALBERT RABIN, NED PAPANIA¹ AND ALLAN MC MICHAEL²
Michigan State College 


PROBLEM 


In a recent review of the literature Jellinek “? states that ‘‘The most important 
conclusion that may be drawn from psychological experiments with alcohol... is... 
that alcohol is a depressant, not a stimulant. It affects first the higher brain centers 
which control the voluntary behaviors and emotions.” The present study is de-
signed to investigate the “depressant” effects of alcohol upon the relevant factors
involved in Rorschach performance. More specifically, we wish to test several 
“working hypotheses” that are particularly related to Rorschach performance. 

Consistent with the notion of the “depressant” effect, the following changes in 
the Rorschach profile may be expected when subjects are under the influence of 
alcohol: 

1. A decrease in the total productivity (R).
2. An increase in reaction time (T/1R).
3. Greater constriction in the “experience balance” (Erlebnistypus), i.e., the introversion-extratension ratio.
4. Greater constriction as expressed in increased F and A percentages.


In addition to these hypotheses, others, consistent with our findings on the effects 
of alcohol on handwriting ® are to be tested. In the study on handwriting it was 
found that the subjects “took more space and were less accurate in copying a para- 
graph”. Less attention to detail was also noted. Consequently, we shall also hypothe- 
size the same tendencies in the relevant Rorschach perceptual tasks: 
5. The W response trend will increase and the Dd trend will decrease under the influence of
alcohol.
6. Accuracy of perception (percentage of good form: F+) will decrease as a result of the ingested
alcohol.


SUBJECTS AND PROCEDURE 


The experimental population consisted of 53 normal subjects of high educational
level, between the ages of 22 and 40, who volunteered for the experiment.³ All but
two of the subjects were moderate social drinkers. A standard brand of 100 proof
bourbon was given to the subjects in straight or mixed (water or soda) form, as fast
as they were able to drink it. During the 4½ hour evening session each subject con-
sumed between 9 and 15 ounces of whiskey. The blood alcohol levels ranged from
.056 to .220 percent. The drinking was done in a social setting, three or four sub-
jects at a time. They played table games or chatted among themselves or
with the experimenters present.

Following a brief handwriting test, the Rorschach was administered, shortly
after 7:30 P. M. and before drinking was begun. The test was readministered about
3½ hours later, after the doses of alcohol mentioned above had been ingested.

In an attempt to control for practice effects and to avoid the excessive effects 
of recent memory upon the responses, two forms of the Rorschach were employed 
with some of the subjects—the original Rorschach test and the Behn-Rorschach, a 
parallel form described by Zulliger“*. Eighteen subjects were given the Rorschach 
first and reexamined with the same test. Eight subjects were tested and retested 
with the Behn. Fifteen were given the Rorschach first, and then the Behn; and
twelve, the Behn first, and then the Rorschach. The scoring followed Beck’s®? 
method. 


¹Now at Wayne County Training School.

²Now at the U. S. Naval Academy.

³The psychological investigations were conducted at the invitation of Prof. Ralph Turner, De-
partment of Police Administration, Michigan State College, whose work was financed by the National
Safety Council. Other participants in the psychological studies were: Dr. Alvis Caliman and Dr.
Harry Blair.
RESULTS 


On the surface, our first hypothesis seems to be only partially borne out. Table
1 presents the mean number of responses under each of the specified conditions. It is
quite evident that when the same instrument is used in the reexamination under
alcohol, no significant changes in productivity are noted. However, when a different
form of the Rorschach is used and the task is somewhat novel, because of the differ-
ence in the stimulus cards, a marked and statistically significant decrease in the level
of productivity appears. Readministration of the same test after a comparatively
short time interval may bring about a contamination of the results because of the
practice effect. Hence, it may be suggested that if the task introduced while sub-
jects are under the influence of alcohol is similar but not identical to one presented
shortly before drinking, considerable reduction in productivity takes place, pre-
sumably because of the “depressant” effects of the drug.


TABLE 1. CHANGES IN PRODUCTIVITY (R) (MEANS OF RESPONSES PER RECORD)

              Before                     During       Significance of
Test          Alcohol      Retest        Alcohol      Difference

Rorschach      29.1        Rorschach      29.4
Rorschach      37.5        Behn           22.6        2% level
Behn           41.1        Rorschach      22.9        1% level
Behn           27.6        Behn           28.3
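
The contrast between same-form and alternate-form retesting is easier to see when each pair of means from Table 1 is expressed as a percentage change. The percentages below are simple arithmetic on the reported means and are not figures given by the authors; the variable and function names are illustrative.

```python
# (test given before alcohol, test given during alcohol, mean R before, mean R during)
MEAN_R = [
    ("Rorschach", "Rorschach", 29.1, 29.4),
    ("Rorschach", "Behn", 37.5, 22.6),
    ("Behn", "Rorschach", 41.1, 22.9),
    ("Behn", "Behn", 27.6, 28.3),
]


def percent_change(before, after):
    """Change from the first to the second mean, as a percentage of the first."""
    return 100.0 * (after - before) / before


for test, retest, before, after in MEAN_R:
    kind = "same form" if test == retest else "alternate form"
    change = percent_change(before, after)
    print(f"{test:>9} then {retest:<9} ({kind}): {change:+.1f}%")
```

The two same-form groups change by no more than two or three percent, while the two alternate-form groups fall by roughly forty percent.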

Results according to expectation are not clearly obtained with the reaction 
time (T/1R) to the cards. As may be noted in Table 2, the two instances in which 
the same test was used even show a reduction (practice effect?) upon retest. In the 
two groups in which different test forms were used the expected increase in reaction 
time does not occur either. There is an increase in one instance and a decrease in 
the other. 

Since the differences reported in Table 2 are not statistically significant, hy- 
pothesis No. 2 must be rejected. However, the predominant trend, i.e. reduction 
in first response time, may be noted. Moreover, the trend is in a direction opposite 
to that hypothesized. 


TABLE 2. MEANS OF FIRST REACTION TIMES IN SECONDS

                         Examination     Re-examination

Rorschach-Rorschach         20.0             14.9
Rorschach-Behn              15.8             13.8
Behn-Rorschach              26.7             30.4
Behn-Behn                   18.6             14.6








The “Experience Balance” in all 53 cases (for a total of 106 Rorschach records) 
was classified according to a fourfold conventional classification: 
Extratensive, where sum of color (C) is two or more greater than movement (M).
Introversive, the converse of the extratensive ratio.
Ambiequal, where both M and C are greater than two and the difference between them is
not more than one.
Constricted, where either M or C, or both, are two or less.
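
The fourfold classification can be written out as a small decision rule. The definitions as stated overlap (a record with M = 1 and C = 4 meets both the extratensive and the constricted description), so the order of the checks below is an assumption, not something the text specifies. The "Unclassified" outcome is likewise an addition for records with a difference between one and two, which the stated definitions do not cover.

```python
def experience_balance(m, sum_c):
    """Classify a record from its movement (M) and sum of color (C) scores.

    Assumed order of checks: the difference rules first, then constriction,
    then ambiequality.
    """
    difference = sum_c - m
    if difference >= 2:
        return "Extratensive"
    if difference <= -2:
        return "Introversive"
    if m <= 2 or sum_c <= 2:
        return "Constricted"
    if abs(difference) <= 1:
        return "Ambiequal"
    return "Unclassified"  # not covered by the definitions as stated


for m, c in [(2, 5), (6, 3), (4, 4.5), (2, 2.5), (3, 4.5)]:
    print(f"M = {m}, C = {c}: {experience_balance(m, c)}")
```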


Table 3 shows that there are no marked changes from test (before alcohol) to retest
(during alcohol). Differences in the incidence of the several categories of Exper-
ience Balance are not statistically significant. A more detailed examination of the
data also shows that in 29 cases there was exactly the same type of balance upon retest
and in the remainder, changes in classification took place. However, the shifts oc-
curred in all directions, without indicating a particular trend. Certainly, we cannot
justify a conclusion of greater constriction of the Experience Balance as a result of
the influence of alcohol.


TABLE 3. THE EXPERIENCE BALANCE

N = 53      Extratensive | Introversive | Ambiequal | Constricted

Test 9 17
Retest 12 13

A summary of the changes occurring in several Rorschach factors, relevant to 
our hypotheses, appears in Table 4. The application of the Chi square test of sig- 
nificance ®: 4) results in levels of confidence between one and five percent. It is quite 
clear that the number of cases in which F and A percent show an increase under re- 
test conditions is highly significant. Thus, support for hypothesis No. 4 is given. 
Both factors are considered as indices of constriction and stereotypy. 
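
The level of confidence reported for F% can be approximated from the counts in Table 4 (27 increases, 16 decreases, 10 unchanged). The sketch assumes that the Chi square test compared the three counts with an equal three-way split of the 53 cases; the source does not state the null hypothesis, so this is a reconstruction rather than the authors' documented procedure.

```python
import math


def chi_square_equal_split(counts):
    """Chi square of observed counts against equal expected frequencies."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


def p_value_df2(chi_square):
    """Upper-tail probability of chi square with 2 degrees of freedom.

    For 2 degrees of freedom this probability is exactly exp(-x / 2).
    """
    return math.exp(-chi_square / 2)


f_percent = [27, 16, 10]  # increase, decrease, no change
chi2 = chi_square_equal_split(f_percent)
p = p_value_df2(chi2)
print(f"chi square = {chi2:.2f}, p = {p:.3f}")
```

Under this assumption the Chi square is about 8.4 with a probability between .01 and .02, in line with the 2% level entered for F%.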


TABLE 4. CASES SHOWING CHANGES IN SEVERAL PERTINENT FACTORS

                                                      Level of
Factor      Increase     Decrease     No Change      Confidence

F%             27           16           10              2%
A%                          13           10              1%
W%                          14           13              5%
Dd%                         27           17              1%
F+%*                        14            2              1%

*These data are based on 18 cases only who were reexamined with the Rorschach,
on which F+ norms are available.


The remaining three factors and the changes reported support the remaining 
two hypotheses stated above. The tendency to use larger areas in perception (W 
increase) and to neglect small and rare detail (Dd) is clearly demonstrated. 

Finally, the reduction in accuracy of form level (F+ percent), though reported 
on a smaller number of cases, because of the lack of F+ norms for the Behn-Rors- 
chach test, confirms our expectations as stated in the last hypothesis. 

It may be noted parenthetically, as was true in the earlier report concerning 
handwriting “, that no differences in the variables were related to alcohol blood level. 


SUMMARY AND CONCLUSIONS 


The purpose of the present study was to investigate the alleged “depressant”
effects of alcohol upon Rorschach performance. Our expectations, based on
Rorschach theory and on results obtained with the effects of alcohol on hand-
writing, were that after alcohol is ingested in sizable amounts (.056 to .220 percent con-
centration in blood) some specific changes would take place. It was hypothesized
that R, Dd and F+ percent would decrease and that T/1R, W and F and A percentages
would increase. Greater constriction in the M:C balance was also expected.

A total of 53 subjects were reexamined 3½ hours after their original Rorschach
with the same or an alternate form (Behn-Rorschach). During this interval, sizable
amounts of alcohol were ingested orally by the subjects. Group results and the in-
cidence of changes in the relevant Rorschach variables upon reexamination are re-
ported.

In general, it may be concluded that alcohol depresses productivity, reduces
accuracy of perception, decreases attention to details and permits the individual a
less critical and less self-controlling attitude.
RIGIDITY AND FLEXIBILITY ON THE RORSCHACH¹
BENJAMIN FABRIKANT²
Veterans Administration Hospital, Buffalo, N. Y. 


PROBLEM 


One of the uses of the Rorschach test is to evaluate the extent of the rigidity 
and flexibility of an individual’s defense system and his approach to life’s problems. 
In a recent experiment on the effects of a verbal set on Rorschach test performance 
of neurotics®), the writer had the opportunity to evaluate the use of certain Ror- 
schach variables as indicators of rigidity and flexibility. 


PROCEDURE 


Two equated groups of male, psychoneurotic veterans were used. The Ror- 
schach test was administered twice, two weeks apart, to each of the subjects in the 
two groups. The subjects in Group A received the same instructions each time, 
those in Group B received new instructions prior to the second administration. 
These instructions were so structured as to maximize changes in the frequency of 
responses in the movement, color, shading, and texture response categories. 

The initial and repeat test records of both groups were inspected for the exist- 
ence of changes on the retest in the frequency of responses in the above mentioned 
four categories. The writer found that only three records in Group A showed changes 
in at least three of the four response categories, while 15 records of Group B showed 
changes in at least three categories. Therefore, for the purposes of this study, only 
Group B records were analyzed. 

Cowen and Thompson), and Hertz“) found that the more rigid individuals 
were characterized by the presence of certain factors on the Rorschach. The factors 
common to both studies, and the cutting scores used most frequently in the avail- 
able literature to indicate the more rigid individuals are: 


1. P% over 30
2. A% over 50
3. D% over 60
4. F+% over 95
5. Sum C greater than Sum M by at least 2:1


Two methods of analysis were used. In the first, the initial test records of Group 
B were divided into two sub-groups. The 15 records mentioned above were put into 
the sub-group having shown changes (HSC), and the remaining 17 records into the 
sub-group not having shown changes (NHSC). χ² was the statistic used to compare
the differences in the presence of these five factors in both sub-groups. 


¹From the Veterans Administration Regional Office, Buffalo, New York.
²Chief, Psychological Research Unit, Veterans Administration Hospital, Buffalo, New York.

For the second analysis, the initial Rorschach test records of Group B were
again inspected and 18 records, those that had at least four of the five factors, were 
put into the sub-group predicted not to show changes (PNSC) in the selected re- 
sponse categories on the retest. The remaining 14 records, those that had three or 
fewer of the five factors in each record, were put into the sub-group predicted to 
show changes (PSC) on the retest. While there was some overlap, not all of the records
in the HSC sub-group were also in the PSC sub-group.
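
The sorting rule used for the second analysis can be sketched as a count of the five cutting scores. The record layout and field names below are hypothetical, and "greater than Sum M by at least 2:1" is read here as Sum C being at least twice Sum M; the source gives no more exact definition.

```python
from dataclasses import dataclass


@dataclass
class RorschachRecord:
    """Hypothetical container for the scores needed by the five cutting rules."""
    p_pct: float
    a_pct: float
    d_pct: float
    f_plus_pct: float
    sum_c: float
    sum_m: float


def rigidity_factor_count(record):
    """Count how many of the five rigidity indicators are present in a record."""
    checks = [
        record.p_pct > 30,
        record.a_pct > 50,
        record.d_pct > 60,
        record.f_plus_pct > 95,
        # Assumed reading of the 2:1 rule: Sum C exceeds Sum M and is at least double it.
        record.sum_c > record.sum_m and record.sum_c >= 2 * record.sum_m,
    ]
    return sum(checks)


def predicted_subgroup(record):
    """Four or more factors: predicted not to show changes (PNSC); otherwise PSC."""
    return "PNSC" if rigidity_factor_count(record) >= 4 else "PSC"


rigid = RorschachRecord(p_pct=35, a_pct=55, d_pct=65, f_plus_pct=96, sum_c=4, sum_m=2)
flexible = RorschachRecord(p_pct=25, a_pct=55, d_pct=40, f_plus_pct=90, sum_c=1, sum_m=3)
print(rigidity_factor_count(rigid), predicted_subgroup(rigid))
print(rigidity_factor_count(flexible), predicted_subgroup(flexible))
```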


RESULTS 


In the first analysis, the comparison of the initial test records between the 
HSC and NHSC sub-groups, the obtained χ² of 1.92 (P = .70) indicates that there
was no significant difference in the presence of the five factors between the two 
sub-groups. 

For the second analysis, the writer predicted that there would be significant 
differences in the mean frequency of the selected response categories from the initial 
to the repeat Rorschach records of the PSC sub-group. The data are shown in 
Table 1. The writer concluded that, according to Wilkinson’s tables“), the one 
significant difference in the mean frequency of the total movement response category
in six comparisons was insufficient to enable him to accept the prediction as confirm-
ed.
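
Why a single significant difference among six comparisons carries little weight can be shown with a binomial calculation. The sketch assumes that the six comparisons are independent and that each was tested at the .05 level; neither assumption is stated in the text, and the calculation is offered as an approximation to the kind of probability the cited tables provide, not as a reproduction of them.

```python
from math import comb


def prob_at_least(k, n, alpha=0.05):
    """Chance probability of k or more significant results in n independent tests."""
    return sum(
        comb(n, i) * alpha**i * (1 - alpha) ** (n - i)
        for i in range(k, n + 1)
    )


p_one_of_six = prob_at_least(1, 6)
print(f"P(at least 1 of 6 significant by chance) = {p_one_of_six:.3f}")
```

On these assumptions, at least one "significant" difference would turn up by chance in roughly a quarter of such sets of six comparisons.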


TABLE 1. t TESTS FOR SIGNIFICANCE OF DIFFERENCES BETWEEN MEANS OF
PSC SUB-GROUP INITIAL AND REPEAT TEST FREQUENCY OF RESPONSES








Mean Scores 
Response Category Initial Repeat 





total movement 5.9 8.6 
responses 20.2 24.0 
F+% 88.6 91.8 
total color 3.9 4.7 
total shading 9 1.2 
total texture 1 | 1.6 

















As a corollary, the investigator predicted that there would be no significant 
differences between the initial and repeat mean frequency of responses in the records 
of the PNSC sub-group. The results, as shown in Table 2, indicate that this pre- 
diction was confirmed. 


TABLE 2. t TESTS FOR SIGNIFICANCE OF DIFFERENCES BETWEEN MEANS OF
PNSC SUB-GROUP INITIAL AND REPEAT TEST FREQUENCY OF RESPONSES








Mean Scores 
Response Category Initial Repeat 





total movement 6.9 7.6 
responses 16.8 18.2 
F+% 90.7 94.5 
total color 1.5 1.8 
total shading 3 2 
total texture i 8 
The conclusion drawn from the data in Tables 1 and 2 is that the presence of
at least four of the five factors may be used to indicate that these individuals are 
rigid, but the absence of these same five factors in a record may not be used to in- 
dicate that the individual is not rigid, i.e., flexible. These results are in accord with 
those published by Cowen and Thompson. 

In their intensive study on rigidity and the Rorschach, Cowen and Thompson“? 
tested twenty Rorschach factors that are used in clinical practice to differentiate 
rigid from non-rigid individuals. Their results show that there were significant 
differences between rigid and non-rigid groups in the mean scores of the following 
eight Rorschach factors. 

1. R (number of responses)
2. C (total number of color responses)
3. M+C (human movement plus total number of color responses)
4. CR (content range)
5. Rej. (rejections)
6. T/R (time per response)
7. T/R1 (time for initial response to Card I)
8. F+% dev. (deviations from an optimal F+ of 80-90%)


The subjects in the HSC sub-group, on the basis that their records showed 
changes from the initial to the repeat test, were predicted to be less rigid, as a group, 
than the subjects in the NHSC sub-group. The results are presented in Table 3. 

TABLE 3. t TESTS FOR SIGNIFICANCE OF DIFFERENCES BETWEEN MEANS OF
HSC AND NHSC SUB-GROUPS’ INITIAL TEST SCORES ON COWEN AND
THOMPSON RIGIDITY FACTORS

                    Mean Scores
Factors           HSC        NHSC         P

R                 20.2       16.8        .10
C                  3.9        2.7        .01
M+C                5.1        3.2        .03
CR                 5.4        3.1        .01
Rej.               .54        .45        .35
T/R                          1.00        .20
T/R1                          .92        .20
F+% dev.                                 .45

















Wilkinson’s tables“ assign a P = .0003 for obtaining three significant differ- 
ences in eight comparisons. The hypothesis that the HSC sub-group is less rigid 
than the NHSC sub-group is confirmed. 


SUMMARY 


The purpose of the present study was to investigate the use of selected Ror- 
schach variables as a means of evaluating rigidity and flexibility as interpreted from 
Rorschach records. Two groups of psychoneurotic veterans received two administra- 
tions of the Rorschach at a two week interval. Group A received the same instruc- 
tions each time while Group B received altered instructions prior to the second 
administration. Group B was then divided into two sets of sub-groups. The first 
division, based on observed changes in the frequency of the movement, color, tex- 
ture, and shading response categories, was into the sub-group having shown changes 
(HSC), and the sub-group not having shown changes (NHSC). The second division, 
based on the presence or absence of five selected Rorschach variables in the initial 
test records, was into the sub-group predicted to show changes (PSC), and the sub-
group predicted not to show changes (PNSC). Other Rorschach variables were also
investigated. The results of the present study are summarized as follows: 


1. There were no significant differences in the mean number of the five Ror- 
schach variables (P%, A%, D%, F+%, M:Sum C ratio) present in the initial test 
records between the HSC and NHSC sub-groups. 

2. The investigator was unable to predict, on the basis of the presence or 
absence of the above five factors, the records of those in Group B which would show 
changes after the altered verbal instructions. 


3. Significant differences were found in mean scores of the Cowen and Thomp-
son rigidity factors in the initial test records between the HSC and NHSC sub-
groups.


The writer concludes that: 


1. The Rorschach factors used most frequently in the literature to evaluate 
rigidity and flexibility in an individual’s personality structure are ineffective for this 
purpose. However, other Rorschach factors are available for use in differentiating 
between rigid and flexible groups. 


2. Further research with the new factors is needed to establish specific cutting 
scores and patterns for use with individual records. 
EVALUATION OF SELECTED SHORT FORMS OF THE WECHSLER
INTELLIGENCE SCALE FOR CHILDREN (WISC)¹


FREDERICK O. CARLETON AND CHALMERS L. STACEY 


Syracuse University 


PROBLEM 


Numerous reports of research relating to the evaluation and application of 
short forms of the Wechsler-Bellevue Intelligence Scale have appeared in the 
psychological literature. Much of the relevant literature has been summarized by 
Herring®>. However, to the authors’ knowledge, attention has not been directed to 
the feasibility of utilizing short forms of the Wechsler Intelligence Scale for Children 
(WISC). It would seem that arguments which have been presented in support of
the applicability of short forms of the W-B would be equally cogent in respect to 
the WISC. 


The purpose of the present paper is to report obtained correlations between 
each of twenty-one short form combinations of subtests of the WISC and the full 
weighted score, for a sample of 365 children referred to the Syracuse State School for 
evaluation. The short form combinations reported are comparable in subtest con- 
tent to those reported by Herring, although the level of difficulty of the subtests 


¹The authors wish to express their appreciation to Dr. S. W. Bisgrove, Senior Director, Syracuse
State School, for his cooperation. 
differs. Abbreviations used in this paper are as follows: Information (I), Compre-
hension (C), Arithmetic (A), Digit Span (Dsp), Similarities (S), Vocabulary (V),
Picture Arrangement (PA), Picture Completion (PC), Block Design (BD), and
Coding (Cod). For this study Cod refers only to Coding B. No subjects were chosen 
for whom Coding A had been administered. 


SUBJECTS 


Since 1950, the WISC has been routinely administered to all children of the 
appropriate age levels who have been referred to the Syracuse State School as 
possible mental defectives for whom institutional placement might be desirable. All 
tests were administered during a sixty day observation period by qualified members 
of the State School staff. At the conclusion of the observation period, recommenda-
tions were made relative to the desirability of institutional placement. Some of the
subjects included in the sample described below were later placed in state institu- 
tions; others were not. However, the sample is considered to be typical of the popu- 
lation of children who have come to the attention of social agencies as possible mental 
defectives, and were accordingly referred to State Schools for observation and eval- 
uation. 


TABLE 1. MEAN, STANDARD DEVIATION AND RANGE IN CHRONOLOGICAL AGE,
WISC TOTAL WEIGHTED SCORE AND WISC FULL SCALE IQ FOR 365 SUBJECTS
REFERRED AS POSSIBLE MENTAL DEFECTIVES

Factors                      |  Range          Mean      S.D.

Chronological Age (Yrs.)     |  7.83-16.17     12.25     1.83
Total Weighted Score         |  27-105         66.02     16.75
Full Scale IQ                |  46-91          67.82     9.40





The present sample was secured by extracting from the school’s file the WISC 
record blank of all subjects referred to the Syracuse State School during the past 
three years. Excluded from the study were those subjects for whom any of the 
WISC subtests were missing or for whom there was any suspicion of organic in- 
volvement as inferred from medical and family history. Table 1 presents descriptive 
statistics concerning general intelligence and chronological age of the 365 subjects 
included in the present sample. From this table it can be seen that the subjects 
ranged in full scale IQ from 46 to 91, thus including not only low grade and border- 
line defectives but some dull normals. The range in chronological age was from 7 
years 10 months to 16 years 2 months. 


RESULTS 


Table 2 presents the results for the twenty-one combinations of subtests as 
short forms of the WISC. Short form scores were obtained by summing the weighted 
scores of the individual subtests comprising each of the short forms. Pearson 
product-moment correlations were obtained between each short form score and the 
sum of the weighted scores for all subtests combined. Correlation coefficients for 
combinations of two subtests range from .64 to .80; for three subtests from .73 to 
.84; for four subtests from .82 to .88 and .88 for five subtests. 
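
The computation described above, summing the weighted scores of the subtests in a short form and correlating that sum with the total weighted score, can be sketched as follows. The five records are invented for illustration and are not data from the study; the total is taken here as the sum of the ten subtests abbreviated in this paper, which is an assumption about how the full weighted score was formed.

```python
import math


def short_form_score(weighted_scores, subtests):
    """Sum the weighted scores of the subtests making up one short form."""
    return sum(weighted_scores[name] for name in subtests)


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical weighted subtest scores for five children (not from the study).
children = [
    {"I": 5, "C": 6, "A": 4, "Dsp": 5, "S": 6, "V": 5, "PA": 7, "PC": 6, "BD": 7, "Cod": 6},
    {"I": 3, "C": 4, "A": 3, "Dsp": 4, "S": 3, "V": 3, "PA": 5, "PC": 5, "BD": 4, "Cod": 5},
    {"I": 7, "C": 8, "A": 6, "Dsp": 6, "S": 7, "V": 8, "PA": 8, "PC": 9, "BD": 8, "Cod": 7},
    {"I": 4, "C": 5, "A": 5, "Dsp": 3, "S": 4, "V": 4, "PA": 4, "PC": 6, "BD": 6, "Cod": 4},
    {"I": 6, "C": 6, "A": 7, "Dsp": 7, "S": 5, "V": 6, "PA": 9, "PC": 7, "BD": 7, "Cod": 8},
]

short = [short_form_score(child, ["V", "PA"]) for child in children]
full = [sum(child.values()) for child in children]
r = pearson_r(short, full)
print(f"V-PA short form versus total weighted score: r = {r:.2f}")
```

Because the short form is itself part of the total, some positive correlation is built into such coefficients, which is worth keeping in mind when reading Table 2.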

As the subjects comprising the present sample were selected from suspected 
mental defectives referred to a State School as potential institutional cases, it is 
judged that the obtained correlations are underestimates of what would have been 
obtained with children selected from the general population. Despite this bias due 
to restriction of range, the magnitude of the obtained relationships would seem 
sufficiently strong to warrant optimism concerning the feasibility of utilizing short 
forms for preliminary screening of groups of children. When contrasted with the 
results of Cotzin“) who used a sample of “high grade” or “borderline’’ defectives, a 
TABLE 2. CORRELATIONS BETWEEN FULL SCALE AND SHORT FORMS OF THE WISC FOR 365 SUBJECTS
REFERRED AS MENTAL DEFECTIVES (S.E.r = ±.05)

Short Form Combination    Correlation    Short Form Combination    Correlation

V-PA                         .80          I-PC-PA-Cod.                .88
S-PA                         .78          C-A-BD-Cod.                 .86
Dsp-PA                       .77          C-S-Dsp-BD                  .86
A-V                          .73          C-V-BD-PC                   .85
I-BD                         .72          I-BD-S-V
C-A                          .70
V-Dsp.                       .68          C-A-S-Dsp-PA
C-V                          .64          C-A-BD-Cod.-PC
C-A-PA                       .84
Dsp-PA-S                     .83
C-BD-Cod.                    .82
C-A-S                        .77
C-V-Cod.                     .76
C-V-S                        .73
















sample somewhat comparable in IQ to the present study, for investigating short
forms of the W-B, the present results would seem at least as favorable as those
obtained with the W-B. For short forms C-A, Dsp-PA, C-A-S, and C-S-Dsp-BD, 
Cotzin’s obtained correlations were .532, .746, .588 and .835 respectively while those 
of the present study were .70, .77, .77, and .86. However, the present sample may 
not only be more heterogeneous in respect to general intellectual ability of the sub- 
jects selected, but the WISC may also lend itself more readily to revealing individual 
differences among subjects of low mental ages than would the W-B which was de- 
signed for a higher level of mental development. 
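
The claim that restriction of range depresses the obtained correlations can be made concrete with the standard correction for direct restriction on one variable (Thorndike's Case 2). The sketch below is not a computation from the study: the observed r and the ratio of unrestricted to restricted standard deviations are hypothetical values chosen only to show the direction and rough size of the effect.

```python
import math

def correct_for_range_restriction(r, sd_ratio):
    """Estimate the correlation in the unrestricted group.

    r        -- correlation observed in the restricted sample
    sd_ratio -- unrestricted SD divided by restricted SD (assumed known)
    """
    return (r * sd_ratio) / math.sqrt(1 - r ** 2 + (r ** 2) * sd_ratio ** 2)

# Hypothetical: r = .70 in the referred sample, general-population SD
# assumed to be 1.5 times the SD of the referred sample.
corrected = correct_for_range_restriction(0.70, 1.5)
```

With these assumed values the estimate rises from .70 to roughly .83; when the two standard deviations are equal the formula returns the observed r unchanged.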

On the other hand, when compared with studies including subjects from a more 
general (or normal) and heterogeneous population; e.g. McNemar“), Hunt®, 
Patterson © ©, and Herring ®?, the present correlations are consistently lower than 
those found with the W-B. Future studies utilizing a more heterogeneous sample 
should reveal whether these trends are due to the selection of subjects or to the re- 
lationships existing among the subtests of the two instruments. 


SUMMARY 


Obtained correlations between full weighted scores of the WISC and selected 
short form combinations are presented for a sample of 365 mental defectives and 
dull normals referred to the Syracuse State School by social agencies for observation. 
The obtained correlations range from .64 for a two subtest combination to .88 for a 
five subtest combination. In contrasting these correlations with those obtained 
using similar short forms of the W-B, the trend is for at least as high a relationship 
between short forms and full scale using the WISC when samples of restricted range 
of intellectual development are employed. When normal or more general samples 
are used, correlations obtained with the W-B are consistently of greater magnitude 
than those reported in the present study. These trends may be due either to selection 
of subjects or to the nature of the interrelationships existing among the respective 
subtests for the two instruments. 

RELATIONSHIPS BETWEEN THE WECHSLER-BELLEVUE FORM I
AND THE WISC¹
IRWIN J. KNOPF, BETTY J. MURFETT, AND VICTOR MILSTEIN
State University of Iowa²


PROBLEM 


Quite in keeping with the growing popularity of the Wechsler Intelligence Scale 
for Children (WISC) is the increasing number of investigations designed to compare 
this new test with a variety of older and more established intelligence measures. 
For the most part, these studies have been restricted to IQ comparisons “: * 4, 6, 
Relatively few studies, however, have attempted to determine the relationships
between the WISC and the Wechsler-Bellevue Form I (WB). While both tests have
been standardized and recommended for use with overlapping age groups (10 to 16 
years), little is known of the comparative test performance within this age range. 
Moreover, the obvious similarities in the form of both tests may suggest face validity 
for the possible but untested assumption that the various subtests have equivalent 
significance on the two scales, and that clinical interpretations made for the adult 
scale would similarly hold for the children’s test. 

The purpose of the present investigation, therefore, was to compare IQ scores 
for the two instruments, as well as to determine the degrees of relationship from test 
to test for each subtest, and between the test profiles for each subject. 


PROCEDURE 


Previous studies °: 7) have included relatively heterogeneous age groups and 
both sexes in their subject populations, although both age and sex differences were 
reported in the standardization data for the WISC“. In order to restrict the in- 
fluence of chronological age and sex, and to make it possible to pursue the effects of 
these variables at a later date, our subject population was restricted to thirty boys 
selected from the Iowa City Junior High School.³ Their ages ranged from 13 years 4
months to 14 years 6 months, with a mean age of 13 years 9 months. As independ-
ently measured by the Otis, their IQ’s extended from 79 to 116, with a mean of
99.73. These scores were distributed within the following categories of intelligence:
dull normal, 5; average, 21; and bright normal, 4. 

The WB and the WISC were administered to all subjects with the order of the 
tests counterbalanced so that half of the subjects received the WB followed by the 
WISC, and half of the subjects received the WISC followed by the WB. Both tests
were administered at one sitting with a brief rest period of approximately 5 minutes
given between tests. Testing was divided among the three authors so that each ad-
ministered and scored a total of 10 WBs and 10 WISCs for 10 subjects. As an addi-
tional precaution, all of the test records were independently re-scored by the other
examiners in order to obtain complete scoring agreement.

¹Presented at the annual APA meetings at Cleveland, September, 1953.

²From the Psychopathic Hospital and the Department of Psychiatry of the College of Medicine.

³Thanks are due to Mr. Carl Miles, Principal of the Iowa City Junior High School, and to our
subjects for giving generously of their time and energy in order to make this study possible.


RESULTS 


The mean IQ scores and their standard deviations for the three scales of the 
WB and the WISC are presented in Table 1. From this it will be noted that the 


TABLE 1. MEANS AND STANDARD DEVIATIONS OBTAINED ON THE WB AND
THE WISC FOR 30 SUBJECTS

                     WB                  WISC
Score             M        SD         M        SD
Verbal IQ        98.87    11.91     102.80    10.33
Perf. IQ        103.43     7.84     104.70    10.64
Full IQ         100.63    10.08     104.03     9.98

mean IQ’s on the WISC were higher than the WB on each scale. The mean differ- 
ences between the scores on both the Verbal and Full Scales were statistically signifi- 
cant at a degree of confidence beyond the .01 level, while the mean difference be- 
tween the Performance IQ’s was not reliably different. These findings are consistent 
with the results of Vanderhost, Sloan, and Bensberg“?, who compared WISC and 
WB IQ’s with high grade mental defectives and found significantly higher Verbal 
IQ’s on the WISC, while no difference was observed between Performance IQ scores.
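
The report does not name the significance test applied to the mean differences. Since every subject took both tests, a t test for correlated (paired) means is a natural choice, and the sketch below assumes it. The IQ values are hypothetical and serve only to show the arithmetic.

```python
import math
import statistics

def paired_t(first, second):
    """t statistic and degrees of freedom for paired measurements."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference, using the sample SD.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1

# Hypothetical IQ's for five subjects (not data from the study).
wb_iq = [95, 100, 104, 98, 110]
wisc_iq = [98, 105, 108, 100, 116]
t, df = paired_t(wb_iq, wisc_iq)
```

The resulting t would be referred to a t table with n - 1 degrees of freedom; with 30 subjects that is 29.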

Product moment correlation coefficients were computed between the IQ’s of each 
scale for the two instruments. The obtained correlations were: .83 between Verbal, 
.64 between Performance, and .83 between Full Scale IQ scores. With the exception 
of the correlation for the Performance IQ’s these results are similar to those of De- 
lattre and Cole® who report correlations of .86 for Verbal, .82 for Performance, and 
.87 for Full Scale IQs. 

Table 2 shows the correlations between weighted scores of similar subtests for 
the two instruments. The correlations for Picture Arrangement and Object Assembly 
are noticeably low and do not differ significantly from zero. This suggests that 
the same factors are not measured by these subtests on the WB and the WISC, despite
the obvious similarities in the form of the subtests. The r for Digit Symbol was sta- 
tistically reliable at the .05 level of confidence, while all of the remaining subtests 
yielded correlations which were significantly different from zero at the .01 level. 
Although the correlations cited indicate significant positive relationships between 


TABLE 2. CORRELATIONS BETWEEN SIMILAR WB
AND WISC SUBTESTS FOR 30 SUBJECTS

Subtest                  r
Information              [illegible]
Comprehension            .54**
Digit Span               [illegible]
Arithmetic               .63**
Similarities             [illegible]
Vocabulary               .83**
Picture Arrangement      .22
Picture Completion       .63**
Block Design             .66**
Object Assembly          .01
Digit Symbol             [illegible]

*Significant at the .05 level
**Significant at the .01 level

nine of the eleven subtests for both instruments, the size of the obtained correlations
is, in general, not large enough to afford prediction for individual cases without 
introducing large sources of error. 

In order to determine the degree of relationship between the test profiles, the 
performance of each subject on both tests was correlated. This procedure yielded a 
correlation for each subject, and thus, a total of 30 correlations. Each r was then 
transformed into a z score for the purpose of deriving an average correlation for the 
entire subject population. A Chi-square test of homogeneity was applied to this 
distribution and it was found that the correlations were homogeneous. The cor-
relations ranged from -.36 to .74, and with 9 degrees of freedom only 7 of these were
found to be significantly different from zero at either the .05 or the .01 level of con- 
fidence. It should be pointed out that, in a distribution of 30 correlations, between 
one and two could by chance be significant at the .05 level. Thus it is apparent that 
only a few of our subjects produced test profiles that showed significant positive 
relationships. In addition the average correlation of .43 is not significant and lends 
support to the conclusion that clinical interpretations made on the basis of one test 
profile (WB) do not, in most cases, hold for test profiles obtained from the other 
instrument (WISC). Wechsler’s early note of caution in this regard becomes even 
more pointed in light of the findings reported here “?. 
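
The averaging and homogeneity steps described above can be sketched as follows. The sketch assumes the usual Fisher r-to-z transformation and the usual chi-square for equal-sized samples, in which each squared deviation of z from the mean z is weighted by n - 3; the paper does not give its formulas, and the five correlations shown are hypothetical.

```python
import math

def fisher_average(rs):
    """Average correlations by way of Fisher's z and convert back to r."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

def homogeneity_chi_square(rs, n_pairs):
    """Chi-square for homogeneity of correlations, each based on n_pairs."""
    zs = [math.atanh(r) for r in rs]
    z_bar = sum(zs) / len(zs)
    chi_sq = (n_pairs - 3) * sum((z - z_bar) ** 2 for z in zs)
    return chi_sq, len(rs) - 1  # statistic and its degrees of freedom

# Hypothetical profile correlations, each computed over 11 subtest pairs.
profile_rs = [-0.36, 0.10, 0.45, 0.60, 0.74]
mean_r = fisher_average(profile_rs)
chi_sq, df = homogeneity_chi_square(profile_rs, n_pairs=11)
```

Each profile correlation rests on 11 paired subtest scores, which is why the text tests each one with 9 degrees of freedom.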


SUMMARY 


This study was designed to compare IQ’s on the WB and the WISC, and to 
determine inter- and intra-test relationships with a group of 30 adolescent males.
The tests were administered to all subjects in one sitting in a counter-balanced order. 
The results obtained were as follows: 

1. Verbal and Full Scale IQ scores were significantly higher on the WISC, 
while the Performance Scale IQ scores were not reliably different. 


2. Correlations of .83 between the Verbal Scale, .64 between the Performance 
Scale, and .83 between the Full Scale IQ’s were obtained. 


3. Significant positive correlations were obtained between the WB and the 
WISC for 9 subtests, while the correlations for the Picture Arrangement and Object 
Assembly subtests were not significantly different from zero. 


4. Seven out of a total of 30 correlations between the test profiles for each sub- 
ject showed significant positive relationships. The average r for the total distribu- 
tion of correlations was not statistically significant. 


On the basis of the above results, it should be noted that at least within the 
confines of the subject population sampled here, clinical interpretations derived from 
the test profile of the WB should not be expected to similarly hold for the test profile 
of the WISC, and vice versa. In addition, it would seem contraindicated to attempt 
to predict Picture Arrangement and Object Assembly subtest scores from one test 
to the other. 

DIVERGENT SCORES ON THE WECHSLER-BELLEVUE SCALE AS
INDICATORS OF LEARNING ABILITY¹


ORISON S. McLEAN²
University of Kentucky 


PROBLEM 


The Wechsler-Bellevue Intelligence Scale“, although designed as a measure of 
general intelligence, has been used extensively as an aid to psychiatric diagnosis. 
Reviews of the literature“: * *) indicate three main analyses of intelligence test 
scores: (1) comparison of the verbal and performance IQ’s, (2) analysis of the sub- 
test scores for the amount of variation, and (3) analysis of the pattern or profile of 
the subtests. Because of the frequent use of the Scale in clinical practice, and since 
conflicting results concerning its validity as a diagnostic aid are reported, the problem 
warranted further investigation from a different approach. The present study is 
unique in relating Wechsler-Bellevue scores to learning tasks which can be objective- 
ly evaluated in a controlled setting. 

This investigation was designed to test the validity of certain inferences drawn 
from clinical interpretations and statements by Wechsler and others concerning the 
diagnostic use of the Bellevue Scale. Two main hypotheses were proposed: (1) 
verbal and performance IQ’s are diagnostic indicators of learning ability in the two 
areas, and (2) subtest variation is indicative of learning ability. 


METHOD 
The subjects of the study consisted of 78 neuropsychiatric patients at the Lex- 
ington Veterans Administration Hospital. In order to test the first hypothesis, the 
hospital subjects were divided into three groups which were similar in age, education, 
and full scale IQ but different in verbal and performance IQ’s. Twenty-five subjects
with verbal IQ’s at least ten points higher than their performance IQ’s constituted
the Verbal Group. The Performance Group consisted of 25 subjects with perform-
ance IQ’s at least ten points higher than verbal IQ’s. The Equal Group contained
24 subjects whose verbal and performance IQ’s did not differ by more than three 
points. 

Learning was defined as a change in the performance of a subject as a result of 
practice. For this study, change in proficiency was evidenced by an accumulation of 
units learned, a decrease in errors, and/or a decrease in time required to accomplish
the task. The amount of practice was limited by a criterion established for each 
learning task. Verbal and nonverbal activities were selected. In keeping with 
Wechsler’s definitions of verbal and performance abilities, the verbal activities in- 
volved words and numbers and the nonverbal ones included the manipulation of 
objects and the perception of visual patterns. 

Five tasks were included in the battery of learning activities. Mirror tracing 
was selected as a practice or “warm-up” task. Paired associates and verbal reasoning
were designated as verbal learning activities, while a formboard and assembly tasks 
were selected as nonverbal. The differences between the verbal and nonverbal learn- 
ing tasks were relative; the activities differed to the degree that the verbal materials 
were words that stand for objects, while the performance tasks were specific, con- 
crete, and functional objects. Although abstract and conceptual thinking were in- 
volved in both types of learning, the one involved thinking about ideas and the other 
involved thinking about objects. In addition to a difference in content, the learning 
tasks differed in the manner of presentation and the nature of practice. The in- 
structions for learning the verbal tasks were oral and written, while the directions 
for the nonverbal ones were presented through demonstration. The verbal tasks em- 
ployed recitation and the nonverbal ones involved manual manipulation. 

¹This study was conducted under the direction of Dr. Robert E. Bills, University of Kentucky, in
partial fulfillment of the requirement for the degree of Doctor of Philosophy.
²From the Veterans Administration Hospital at Clarksburg, West Virginia.


RESULTS


Since intelligence is positively correlated with learning ability, it was necessary 
to control for differences in full scale IQ between the groups. An analysis of covar- 
iance between IQ and the learning measurements effected statistical control for 
this factor. Results of the analyses of the learning scores revealed statistically sig-
nificant differences between the groups. Table 1 shows this difference to be signifi-
cant for all learning tasks except verbal reasoning. The Verbal Group achieved
greater proficiency with the verbal tasks, and the Performance Group was superior
in nonverbal situations.


TABLE 1. COMPARISON OF THE LEARNING SCORES OF THE THREE IQ GROUPS*

Learning Task            F
Paired Associates        3.39
Verbal Reasoning         [illegible]
Formboard (time)         [illegible]
Formboard (errors)       [illegible]
Assembly                 [illegible]

*Means adjusted for full scale IQ by analysis of covariance.


Secondary to and derived from the first hypothesis was the prediction that sub- 
jects with no difference between verbal and performance IQ’s would show no differ- 
ence between their verbal and nonverbal learning abilities. A comparison of the 
scores on the verbal and nonverbal tasks revealed no significant differences in their 
learning abilities. 

The second hypothesis concerned Wechsler-Bellevue subtest variability and 
learning ability. In order to obtain a measure of subtest variation, standard de- 
viations of the weighted subtest scores were calculated for each subject. The stand- 
ard deviations were ranked in order of magnitude and divided at the midpoint to 
categorize groups with high and low variation. It was proposed that the group with 
relatively little subtest variation would achieve higher learning scores than the group 
with large subtest variation. To test this prediction, analyses of variance of the 
learning scores were calculated with covariant control for full scale IQ. The results 
of the analyses support the hypothesis. Table 2 shows the group with little variation
obtained significantly higher learning scores than the group with relatively large
variation.
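
The variability measure and the division into high and low groups can be sketched as follows. The sketch assumes a population standard deviation and a simple split of the ranked subjects at the midpoint; the paper specifies neither detail, and the subject labels and weighted scores are hypothetical.

```python
import statistics

def subtest_sd(weighted_scores):
    """Standard deviation of one subject's weighted subtest scores."""
    return statistics.pstdev(weighted_scores)

def split_by_variation(profiles):
    """Rank subjects by subtest SD and divide them at the midpoint.

    profiles -- dict mapping subject label -> list of weighted scores
    Returns (low-variation subjects, high-variation subjects).
    """
    sds = {name: subtest_sd(scores) for name, scores in profiles.items()}
    ranked = sorted(sds, key=sds.get)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]

# Hypothetical profiles for four subjects (not data from the study).
profiles = {
    "S1": [10, 10, 11, 9, 10],
    "S2": [4, 14, 8, 12, 6],
    "S3": [9, 11, 10, 12, 8],
    "S4": [3, 15, 7, 13, 5],
}
low, high = split_by_variation(profiles)
```

The learning scores of the two groups would then be compared with full scale IQ held constant, as the text describes.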


TABLE 2. COMPARISON OF THE LEARNING SCORES OF THE NEUROPSYCHIATRIC
SUBJECTS WITH HIGH AND LOW VARIABILITY ON THE W-B SUBTESTS
GROUPS EQUATED FOR IQ*

Learning Task            F              P
Paired Associates        [illegible]    ns
Verbal Reasoning         [illegible]    .04
Formboard (time)         [illegible]    .001
Formboard (errors)       [illegible]    .05
Assembly                 [illegible]    ns

*Adjustments by analysis of covariance.


A finding consistent with the second hypothesis was the general superiority of 
the group with no verbal and performance IQ difference in mastering both verbal 
and nonverbal learning tasks. The results are in agreement with the hypotheses 
since significantly more subjects in the Equal Group were classified in the division 
with little subtest variation. This result points to a possible relationship of the 
equality of verbal and performance IQ’s to uniformity in subtest patterns. Relative-
ly uniform intellectual functioning may be interpreted as indicative of comparatively
proficient learning ability.





CONCLUSIONS 


Within the limitations of the design of the problem and the samples of subjects 
studied, the following conclusions were reached:


1. Verbal IQ is indicative of ability to learn in verbal situations. 

2. Performance IQ is indicative of ability to learn in nonverbal situations. 

3. Equality of verbal and performance IQ’s is indicative of no difference in 
verbal and nonverbal abilities. 

4. Subtest variation is inversely related to learning proficiency. 

ADJUSTMENT AND THE DISCREPANCY BETWEEN THE PERCEIVED
AND IDEAL SELF


BERNARD CHODORKOFF
Dearborn VA Hospital¹


PROBLEM 


The changes that occur in the relationship between the perceived and desired 
or ideal self at different stages of therapy have been demonstrated by research in 
client-centered therapy“. It has been found that the perceived self bears little 
resemblance to the ideal self at the outset of therapy. During the process of therapy, 
the congruence is somewhat greater. At the conclusion of therapy, the perceived 
self bears much relationship to the desired self. The change in the perceived self is 
gradual and directional, bringing it increasingly closer not only to the self-ideal at 
each point, but closer to the self that was wanted before therapy ®: »- 7®. 

These findings seem to be characteristic of successful therapy. Inasmuch as 
improved personal adjustment must be one of the concomitants of successful therapy, 
the following hypothesis was set up to be tested: The greater the correspondence 
between the perceived self and the ideal self, the more adequate the individual’s 
personal adjustment. 


METHOD 


The subjects for this research were 30 male undergraduates who were taking 
introductory psychology at the University of Wisconsin. Their ages ranged from 18 
to 29 years; mean age was 21.4 years with a sigma of 1.92 years. None of the sub- 
jects had ever been, or was currently involved in counseling or psychotherapy. 

Each subject filled out a Biographical Inventory and was administered the 
Rorschach and the Thematic Apperception Test. One week after these had been 
completed, the subject was given a Q-Sort of 125 short self-descriptive statements,
and instructed to sort these so that they described himself (perceived self).² A day
or two later, the subject sorted these same items, but this time describing the person
he would like to be (ideal self).

¹The statements and conclusions published by the author do not necessarily reflect the opinion or
policy of the Veterans Administration.

Prior to the administration of the Q-Sort procedures, each subject was rated 
for adequacy of personal adjustment. The ratings were based on the clinical evalua- 
tion of the projective and biographical material. The rating scale employed con- 
sisted of 11 subscales, each of different content. The subject’s adjustment score 
was derived by converting his rating on each scale into a standard score based on 
the distribution of the 30 ratings for that scale. For each subject the sum of the 
standard scores on the 11 subscales comprised his adjustment score. 
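
The derivation of the adjustment score can be sketched as follows. The sketch assumes that each standard score is the rating minus the subscale mean, divided by the subscale standard deviation computed over all subjects; whether the original used N or N - 1 in that standard deviation is not stated. The ratings are hypothetical, and three subscales stand in for the eleven.

```python
import statistics

def adjustment_scores(ratings):
    """Sum of per-subscale standard scores for each subject.

    ratings -- one list per subject, holding one rating per subscale
    """
    n_scales = len(ratings[0])
    totals = [0.0] * len(ratings)
    for k in range(n_scales):
        column = [subject[k] for subject in ratings]
        mean = statistics.mean(column)
        sd = statistics.pstdev(column)  # assumes the ratings vary on each subscale
        for i, value in enumerate(column):
            totals[i] += (value - mean) / sd
    return totals

# Hypothetical ratings: three subjects, three subscales.
ratings = [[3, 5, 2], [4, 6, 4], [5, 4, 6]]
scores = adjustment_scores(ratings)
```

Standardizing before summing gives each subscale equal weight in the total, whatever its raw spread of ratings.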

The subject’s sorting of his perceived self was correlated with the sorting of the 
ideal self. The product-moment r obtained was transformed to Fisher’s z and it was 
this score that was used as the measure of correspondence (correspondence score) 
between the perceived and ideal self. 
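
The correspondence score can be sketched in the same way. Each Q-sort is represented here as a list giving the pile value assigned to each statement, in the same statement order for both sorts; that layout, and the six-item sorts shown, are assumptions made for illustration.

```python
import math

def correspondence_score(perceived, ideal):
    """Fisher z of the product-moment r between two sorts of the same items."""
    n = len(perceived)
    mp, mi = sum(perceived) / n, sum(ideal) / n
    cov = sum((p - mp) * (q - mi) for p, q in zip(perceived, ideal))
    var_p = sum((p - mp) ** 2 for p in perceived)
    var_i = sum((q - mi) ** 2 for q in ideal)
    r = cov / math.sqrt(var_p * var_i)
    return math.atanh(r)  # undefined if the two sorts are identical (r = 1)

# Hypothetical pile placements for six statements.
perceived = [1, 2, 3, 4, 5, 6]
ideal = [2, 1, 4, 3, 6, 5]
z = correspondence_score(perceived, ideal)
```

The transformation stretches the upper end of the scale, so that differences among high correlations count for more than equal differences among low ones.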


RESULTS 


Examination of the scattergram plot of the data made it clear that the relation- 
ship between the adjustment and correspondence scores was a curvilinear one. Eta 
was therefore employed to test the significance of this relationship. 

A value of .51 was obtained for Eta-squared (η²). Following the procedures
outlined by Peters and Van Voorhis®, η² was transformed to ε² and then further
corrected so that the resulting value would be free from bias, independent of the
size of the sample, and independent of the number of classes into which the sample
was divided. Corrected ε was equal to .42; this is significant beyond the .01 level.
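
The first two steps can be sketched as follows. The sketch uses the common definition of η² as the ratio of between-class to total sum of squares, and one widely cited form of the ε² adjustment, 1 - (1 - η²)(N - 1)/(N - k). The further correction mentioned in the text is not reproduced, the number of classes is an assumed value, and the grouped scores are hypothetical.

```python
def eta_squared(groups):
    """Correlation ratio squared: between-class SS over total SS."""
    values = [v for g in groups for v in g]
    grand = sum(values) / len(values)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    return ss_between / ss_total

def epsilon_squared(eta_sq, n, k):
    """Adjust eta-squared for n cases divided among k classes."""
    return 1 - (1 - eta_sq) * (n - 1) / (n - k)

# Hypothetical adjustment scores grouped into three correspondence classes.
groups = [[1, 2, 3], [4, 5, 6], [2, 3, 4]]
eta_sq = eta_squared(groups)

# Reported eta-squared of .51 with N = 30; k = 6 classes is an assumption.
eps_sq = epsilon_squared(0.51, n=30, k=6)
```

The adjustment always lowers the estimate, and lowers it more as the number of classes approaches the number of cases.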


Figure 1 presents a graphic description of the curvilinear relationship, and is based 
on the scattergram plotted. 





FIG. 1. THE RELATIONSHIP BETWEEN ADJUSTMENT
AND CORRESPONDENCE SCORES. THE NUMBERS 0 - 6
REPRESENT INCREASING ADEQUACY OF ADJUSTMENT
AND INCREASING CORRESPONDENCE BETWEEN THE
PERCEIVED AND IDEAL SELF.
[Figure not reproduced; axes: ADJUSTMENT, CORRESPONDENCE.]








DISCUSSION


It was predicted that the more adequate the adjustment of the individual, the 
greater the correspondence between his perceived and desired self. The results indi- 
cate that this prediction represents only part of the total picture. In the first seg- 
ment of the curve shown in Figure 1, it can be seen that the correspondence between 
the perceived and ideal self increases as adjustment becomes more adequate. In the 
other segment of the curve the opposite is true. From here on the correspondence 
between the perceived and ideal self decreases as adjustment becomes more adequate. 

The findings from research in client-centered therapy suggested to us that the 
subjects with the poorest adjustment would be the ones to show the least similarity
between their perceived and ideal selves. This suggestion was not upheld by the
data. The most poorly adjusted subjects in our group were found to have low cor-
respondence scores; however, there were some better adjusted subjects whose cor-
respondence scores tended to be even lower.

²The procedures (with the exception of the ideal self sorting procedure) and the instrumentation
have been described elsewhere“). They will not be presented in detail here because of this.

A qualitative perusal of the data indicated that there was more overlap between 
the better and poorer adjusted subjects in terms of low correspondence scores, than 
in terms of high correspondence scores. In contrast, those subjects who showed the 
highest correspondence scores tended to be the ones rated as the most adequate in 
their adjustment. 

An interesting possibility is suggested by these findings and would be worth
investigating further. Of the individuals who are rated as relatively adequate
in adjustment, there may be two types. One type perceives himself in a satisfying 
way; this type shows high correspondence between his perceived and ideal self and 
feels no need for change. The other type apparently is dissatisfied with himself; he 
presents a self-ideal which is discrepant with his perceived self because he is moti- 
vated to change in a direction which will be more satisfying to him. 

The more poorly adjusted person is dissatisfied with himself and maintains a 
self-ideal which is discrepant with his perceived self. However, he probably is not as 
motivated for change as is the better adjusted person. The difference in motivation 
for change may account for the tendency of the most poorly adjusted subjects to 
show correspondence scores which are low, but not as low as the better adjusted 
subjects of the second type. 


SUMMARY 


The following hypothesis, based on the findings stemming from research in 
client-centered therapy, was set up to be tested: The greater the correspondence 
between the perceived and ideal self, the more adequate the individual’s personal 
adjustment. 

The results showed that a significant curvilinear relationship existed between 
adjustment and degree of correspondence between the individual’s perceived and 
ideal self. The curvilinear regression of adjustment scores on correspondence scores 
showed that as adequacy of adjustment decreased, correspondence between per- 
ceived and ideal self decreased too, until a point was reached where from then on 
adequacy of adjustment increased as correspondence scores decreased. However, 
the level of adequacy of adjustment did not rise to the level found for the subjects 
with high correspondence scores. 

It can therefore be seen that caution must be taken in interpreting correspond- 
ence between perceived and ideal self as reflecting adequacy of adjustment. Al- 
though the most adequately adjusted subjects showed the highest correspondence 
between perceived and ideal self, the least adequately adjusted subjects did not 
show the least correspondence. A tentative interpretation of these results was 
framed in terms of suggestions for future research. 

ACCEPTANCE OF SELF AND OTHERS, AND ITS RELATION
TO THERAPY-READINESS 


WILLIAM F. FEY 
University Hospitals 


Madison, Wisconsin 


INTRODUCTION 


This brief paper is intended to augment the growing understanding of the 
relationship between self-acceptance and acceptance of others. Theorists in the 
field of psychotherapy, such as Allen, Sullivan, Fromm, Rogers, Adler and Horney 
all posit some basic dependence between these attitudes; Fromm-Reichmann epitom-
izes the viewpoint in stating that “. . . one can love others only to the extent that
one loves oneself” ®: »- 16). Sheerer supplied the early experimental verification of
this position by having judges rate verbatim excerpts from therapy and demonstrat- 
ing “a substantial and statistically significant positive relationship between ex- 
pressed attitudes of acceptance of self and the expressed attitudes of acceptance of 
others” ®. ». 178). Stock confirmed these results by having judges rate entire inter-
views for overall attitudes of acceptance of self and of others; correlations of +.38
and +.66 were found. Phillips“) converted Sheerer’s criterion descriptions into 
simple statements to form a questionnaire of fifty items, with half the items referring 
to self-attitudes and half to attitudes toward others. The questionnaire was given to 
two groups of college students and to two groups of high school students; correlations 
between self-acceptance and acceptance of others scores ranged from +.51 to +.74. 
McIntyre® used the Phillips questionnaire in conjunction with a sociometric device
to explore the Rogerian hypothesis that the self-accepting individual will tend to
have better interpersonal relationships. Although McIntyre was unable to confirm
this hypothesis, he did find a +.46 correlation between scores of self-acceptance
and of acceptance of others on the Phillips instrument for his population of 112
college students. Finally, Berger“? constructed new scales of self-acceptance and 
acceptance of others, building largely from the Sheerer criteria. These scales were 
then administered to groups of college students, prisoners, speech problems and an 
adult YMCA class. The correlations between expressed attitudes of self-acceptance 
and acceptance of others ranged from +.36 to +.69. 


PROBLEM AND METHOD 


Previous studies in combination hint that the diversity of correlations between 
these two attitude-systems is perhaps not due to random influences alone but may 
actually reveal important variations in the individuals sampled. If the acceptance 
of others rests upon genuine self-acceptance, for example, one might hypothesize 
that a disparity between these two sets of attitudes betrays the operation of de- 
fensive mechanisms. Moreover, it is possible that such atypical intrapersonal ar- 
rangements might be confirmed and perhaps clarified by a study of the individual’s 
expressed interest in psychotherapy, as some measure of his acceptance of the 
psychological status quo. Therefore, the writer wished to examine the relationship 
between expressed attitudes of self-acceptance and of acceptance of others on the 
one hand, and expressed readiness for or interest in psychotherapy on the other. 

From various empirical and a priori sources, items for three separate scales were 
constructed—one to measure expressed attitudes of self-acceptance (abbreviated 
hereafter AS, containing 44 items), one for expressed attitudes of acceptance of 
others (AO, 36 items), and one for expressed attitudes of readiness for therapy 
(Rx, six items). Sample items from the three scales respectively are these: “I often
kick myself for doing something dumb;” “I wish people would be more honest with
you;” and “I’d welcome a chance to have some personal counseling.” These 86
items were randomized and compiled into a mimeographed questionnaire. For each
item a range of five answers was possible, from 1 (“very true of me”) to 5 (“not at
all true of me”). Split-half reliabilities for each scale were computed and appear in
Table 1. The questionnaire was answered anonymously by sixty members of a
freshman medical class.
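
A split-half reliability of the kind mentioned above might be computed as in the following sketch. Two details are assumptions, since the paper gives neither: the items are divided into odd-numbered and even-numbered halves, and the half-test correlation is stepped up with the Spearman-Brown formula. The responses are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_reliability(item_responses):
    """Odd-even split-half reliability with Spearman-Brown correction.

    item_responses -- one list of item scores (1 to 5) per respondent
    """
    odd = [sum(row[0::2]) for row in item_responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_responses]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)

# Hypothetical responses of four people to a four-item scale.
responses = [[1, 2, 1, 2], [3, 3, 4, 3], [5, 4, 5, 5], [2, 3, 2, 2]]
reliability = split_half_reliability(responses)
```

The step-up is needed because each half contains only half the items, and a shorter scale is less reliable than the full one.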


RESULTS 


Table 1 summarizes the data obtained. In addition, the following relationships 
appeared: 

1. Expressed attitudes of self-acceptance are firmly and positively related to 
expressed attitudes of acceptance of others. The product-moment correlation is 
+.40, which differs from zero at the .01 confidence level. 

2. Expressed self-acceptance is not significantly related to the expressed 
readiness for therapy, for here r = -.25, P exceeds .05.

3. Expressed acceptance of others is not significantly related to the expressed 
readiness for therapy; r = +.18, P exceeds .05. 

4. An interaction effect is present between AS and AO scores: the individuals 
who are least interested in therapy are those who, relatively, express high acceptance 
of themselves and low acceptance of others. The significance of this effect may 
be expressed by subtracting each individual’s AO score from his AS score; these 
“discrepancy” scores correlate -.45 with the expressed desire for therapy (P = .01). 
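
The significance statements in the four findings above can be checked with the usual t test for a correlation coefficient. The sketch below assumes N = 60 for every correlation and takes the two-tailed critical values for 58 degrees of freedom from standard t tables; neither the test nor the critical values are stated in the text.

```python
from math import sqrt

N = 60  # respondents, as reported

# Two-tailed critical t values for N - 2 = 58 degrees of freedom,
# taken from standard t tables (an assumption, not from the source).
T_CRIT = {".05": 2.002, ".01": 2.663}


def t_for_r(r, n=N):
    """t statistic for testing a product-moment r against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


def significance(r):
    t = abs(t_for_r(r))
    if t >= T_CRIT[".01"]:
        return "P < .01"
    if t >= T_CRIT[".05"]:
        return "P < .05"
    return "P exceeds .05"


reported = {
    "AS x AO": 0.40,
    "AS x Rx": -0.25,
    "AO x Rx": 0.18,
    "(AS - AO) x Rx": -0.45,
}
results = {pair: significance(r) for pair, r in reported.items()}
```

Under these assumptions the two Rx correlations fall short of the .05 level while the AS x AO and discrepancy correlations pass the .01 level, in line with the findings as reported.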


TABLE 1. SUMMARY OF DATA FOR THE THREE SCALES 

Scale    No. of Items    Reliability    Range      Mean 
AS       44              .92            114-210    167.33 
AO       36              .76            104-166    134.60 
Rx        6              .84              7-30      16.16 


DISCUSSION 


The significant AS x AO correlation, while lower than several reported in earlier 
studies, conforms to expectation. When the scattergram of AS scores against AO 
scores is studied, the kind of patterning of these measures which impairs their cor- 
relation becomes evident. The desire for therapy is not particularly low along the 
major correlation diagonal, which suggests that unsatisfying adjustments are to be 
found even where AS and AO scores are roughly equal. Thus, it appears that the 
relationship between AS and AO scores is more sensitive to the character of one’s 
adjustment than to its adequacy. 

The two non-significant correlations involving Rx scores, while each bears the 
predicted sign, suggest that the relationships among these variables are more com- 
plex; and, indeed, it has been shown that neither the AS nor the AO score predicts 
therapy-readiness as well as does the discrepancy between them. 

A quadrant-analysis of the AS x AO scattergram mentioned above has pro- 
vocative implications which bear a measure of clinical validity. This scattergram 
may be quartered by two coordinates, each dividing in half the distribution of 
scores on the AS or AO scales. If the persons falling into each quadrant are taken 
to be a group, the differences among their mean Rx scores may be tested for sig- 
nificance. When this is done, only the high AS — low AO group distinguishes itself 
from the others; the chance probability of their relative indifference to therapy is less 
than .01. These individuals suggest the type of adjustment wherein one defends 
himself by projecting deficiencies upon and by disparaging others while asserting 
complacency for oneself. Such a person perhaps dares not concede the possibility 
that he might profit from therapy and would very likely prove a difficult candidate 
if he should enter it. 
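
The quartering of the scattergram described above can be sketched as follows. The median split and the function name are assumptions for illustration, the scores are hypothetical, and the sketch stops at the group means because the significance test applied to them is not named in the text.

```python
from statistics import mean, median


def quadrant_rx_means(as_scores, ao_scores, rx_scores):
    """Mean Rx score for each quadrant of the AS x AO scattergram.

    Assumption: each scale is cut at its median, so that each
    coordinate divides the distribution of scores in half.
    """
    as_cut, ao_cut = median(as_scores), median(ao_scores)
    groups = {}
    for a, o, rx in zip(as_scores, ao_scores, rx_scores):
        key = (
            "high AS" if a > as_cut else "low AS",
            "high AO" if o > ao_cut else "low AO",
        )
        groups.setdefault(key, []).append(rx)
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical scores for eight persons (not the study's data).
as_demo = [200, 190, 180, 170, 160, 150, 140, 130]
ao_demo = [110, 150, 120, 160, 115, 155, 125, 165]
rx_demo = [25, 15, 24, 14, 16, 12, 18, 10]
quadrant_means = quadrant_rx_means(as_demo, ao_demo, rx_demo)
```

Each key of `quadrant_means` names one of the four groups, so the high AS, low AO group can be compared directly with the other three.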

In the quadrant diagonally opposite this group appear individuals whose 
self-concepts suffer from contrast with their view of others. These persons depre- 
ciate themselves and can readily tolerate the prospect of some change in themselves. 
If the high AS — low AO group first described might be considered theoretically 
prodromal to paranoia, this second group (high AO — low AS) presages the path- 
ological self-disparagement of clinical depression. 

Towards the ends of the major correlation diagonal lie the remaining two 
groups. At the lower end are the hard, cynical realists who find little to applaud in 
themselves or in others. Those at the upper end (high AS — high AO) portray 
relatively the “best of all possible worlds” outlook. For these two groups the clinical 
models are less convincing. One wonders, however, what the general effect of 
therapy would be upon such a constellation of persons—if it might tidy up but con- 
tract the major correlation diagonal. Do the healthiest people cluster toward the 
center of such a diagram, surrounded by concentric gradients of increasing pathology? 

It is perhaps well to note finally the persons who profess to think well of them- 
selves and of others but who are not significantly free of a desire for change. This 
finding should dispel the illusion that an individual’s expressed attitudes are necessarily 
his determining or “real” attitudes. Indeed, there is little doubt that two persons, 
by giving the same answers for quite different psychological reasons, may find 
themselves in the same quadrant. It is less likely, however, that they will share the 
same Rx score. Perhaps the best one may hope is that “real” attitudes can some- 
how be lawfully inferred from an expanding knowledge of the patterning of ex- 
pressed attitudes. 


SUMMARY 


Scales were devised to measure expressed attitudes of self-acceptance, of 
acceptance of others, and of the readiness for therapy. Data from sixty freshman 
medical students were obtained and analyzed. A significant positive relationship 
exists between scores for self-acceptance and acceptance of others. Neither self- 
acceptance nor acceptance of others scores are related significantly to the expressed 
readiness for therapy; this readiness is, however, firmly correlated to the discrep- 
ancy between self-acceptance and acceptance of others scores. The clinical validity 
of these relationships and their implications were briefly discussed. 


SOME FACTORS INFLUENCING THE UNRELIABILITY OF 
CLINICAL JUDGMENTS* 


FRANKLYN N. ARNHOFF, PH.D. 
Northwestern University and Downey V. A. Hospital 


PROBLEM 


The unreliability of clinical judgment is well known, particularly in the field of 
diagnosis. Little has been done, however, toward any analysis of the factors 
responsible for this unreliability in order that it may be understood, controlled and 
corrected. Magaret points out that we need both a philosophy of diagnosis and a 
sophisticated understanding of its nature. Hunt, Wittson and Hunt suggest 
that our understanding of clinical judgment might be furthered if we conceived of 
it, not as a unique and special kind of professional performance, but as one example 
of the broader phenomenon of human judgment in general. The present study takes 
this approach, and studies the effect upon clinical judgment of both the professional 
experience of the judge and the anchoring of the scale which he is using to 
make his judgment. Both experience and anchoring have been shown to 
influence judgment in a wide range of situations varying from those of classical 
psychophysics to the judgments of the prestige of occupations and the undesirability 
of certain forms of behavior. 

We assumed that clinical judgments might show the same relativism that has 
been demonstrated in other fields of judgment and that this relativism might be 
contributing to the unreliability of the clinical judgments. Specifically we proposed 
three hypotheses: 

(1) That introducing an anchoring stimulus at either end of the stimulus con- 
tinuum would cause a shift in the judged value of the stimuli being evaluated; 

(2) That these anchoring effects would be a function of the experience of the 
judges, with the most experienced judges showing the least shift; and 

(3) That the reliability of the judgments, here defined as inter-judge agree- 
ment, would also be a function of experience, with the most experienced judges 
showing the greatest reliability or agreement. 


SUBJECTS AND MATERIALS 


In order to test the effect of experience, three groups of judges were selected 
from three separate levels of clinical training. Sixty were undergraduates who had 
just completed a course in abnormal psychology; sixty were graduate students 
interning during a clinical psychology training program, and sixty were professional 
clinicians with four years or more on-the-job professional experience. As stimuli, 
schizophrenic responses to items on the Wechsler-Bellevue and Terman-Binet 
vocabulary tests were used and the subjects were asked to rate these on an 11-point 
scale for the severity of the disorder in the thinking processes exhibited in the 
responses. 


PROCEDURE 


For the construction and equation of the two stimulus series used in the ex- 
periment, it was first necessary to obtain a number of stimuli whose stimulus values 


*This paper is a condensation of a longer one submitted to the Graduate School of Northwestern 
University in partial fulfillment of the requirements for the Ph.D. degree. The complete thesis, con- 
taining a more detailed statement of procedure and all the relevant data with their complete statistical 
analyses, is available from University Microfilms, Ann Arbor, Michigan. The study is part of a larger 
project being conducted at Northwestern University under Professor William A. Hunt through con- 
tract 7onr-45011 with the Office of Naval Research. The opinions expressed, however, are those of the 
author and do not represent the opinions or policy of the Naval service. Thanks are due Professor B. 
——— for assistance with the design, and Dr. J. W. Cotton for assistance with the statistical 
analyses. 

were known. 222 schizophrenic responses to vocabulary items, judged by a group of 
3 trained clinicians to cover all possible values of confusion in thinking in such re- 
sponses, were rated on an 11-point scale by another 22 experienced clinicians of at 
least four years professional experience. These judges were not used in the experi- 
ment proper, serving only as a standardization group. The means and standard de- 
viations of the stimuli were then computed. This furnished a group of standardized 
stimuli from which two roughly equivalent stimulus series were constructed. Each 
series contained 10 items—2 of each representing scale values 4, 5, 6, 7 and 8. The 
stimuli thus represented only the middle ranges of the continuum of “confusion” 
since scale values 1, 2, 3, 9, 10 and 11 were not represented in the series. It was 
felt that the use of a limited range of stimuli would offer more room for movement or 
shifts under the anchoring conditions. 

All sixty subjects at each of the three experience levels were given the first 
series with the same instructions used for the original standardization group of 22 
clinicians. Following this, each group of sixty subjects was split into three sub- 
groups of 20 each. Each of these sub-groups then were presented with the second 
series of stimuli. The first sub-group received the previous instructions but with an 
anchor at the high end of the scale. This was done by adding the following to the 
standard instructions: “In order to further assist you in defining the scale, we will 
give you the following as an illustration of a response which represents the category 
eleven: FABLE: Trade good sheep to hide in the beginning.” The second group 
received the stimuli with an anchor at the low end of the scale as follows: “In order 
to further assist you in defining the scale, we will give you the following as an illus- 
tration of a response which represents the category one: GAMBLE: To take a 
chance, a risk.”” The third group served as a control and got the second series with 
no anchor and no change in instructions. 

In processing the data, the mean and standard deviation of each subject’s 
ratings of each stimulus series was computed. These obtained means and standard 
deviations then were themselves treated as though they were raw scores and a mean 
and standard deviation for the distribution of means and the distribution of standard 
deviations were computed. This was done with both stimulus series for each ex- 
perience level of 60 subjects, and for each sub-group of 20 within the experience 
levels. The combining of the three sub-groups at each experience level on the second 
stimulus series was justified by the fact that no significant effects for the anchoring 
conditions were found within any experience level. 
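
The two-stage summary described above (a mean and standard deviation per judge, then a mean and standard deviation of those values across judges) can be sketched as follows. The function name and ratings are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not say whether N or N - 1 was used.

```python
from statistics import mean, pstdev


def summarize_group(ratings_by_judge):
    """ratings_by_judge: one list of ratings (1 to 11) per judge.

    Each judge's mean and standard deviation are treated as raw
    scores and summarized again across the group of judges.
    """
    judge_means = [mean(r) for r in ratings_by_judge]
    judge_sds = [pstdev(r) for r in ratings_by_judge]
    return {
        "mean_of_means": mean(judge_means),
        "sd_of_means": pstdev(judge_means),
        "mean_of_sds": mean(judge_sds),
        "sd_of_sds": pstdev(judge_sds),
    }


# Hypothetical ratings of one ten-item series by three judges.
demo_ratings = [
    [4, 5, 6, 7, 8, 4, 5, 6, 7, 8],
    [5, 5, 6, 6, 7, 5, 5, 6, 6, 7],
    [6, 7, 8, 7, 6, 6, 7, 8, 7, 6],
]
summary = summarize_group(demo_ratings)
```

The `sd_of_means` entry is the quantity that serves as the index of inter-judge agreement: the smaller it is, the more closely the judges' mean ratings agree.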

Comparisons of results between sub-groups within an experience level as well 
as between experience levels on a single scale were made by analyses of variance. 
Overall comparisons for all sub-groups and experience levels on both scales, were 
made by analyses of covariance.¹ Bartlett’s chi-square tests were used for determinations 
of homogeneity of variance. As measures of reliability of the judges’ ratings, 
the standard deviations were used as primary measures, and were supplemented by 
an r as recommended by Johnson (6, p. 184). 


RESULTS 


Analysis of covariance of the mean ratings for the various sub-groups (anchor- 
ing conditions) and the three experience levels failed to demonstrate significance, 
indicating that the introduction of the anchoring stimuli failed to produce changes 
that were any greater or less than those occurring in the control groups. This was 
found true for all three experience levels. Thus our first hypothesis was not substantiated. 


Since no anchoring effects appeared, our second hypothesis (that anchoring 
effects would be a function of experience) becomes meaningless. 

To test our hypothesis concerning reliability we compared the variances of the 
mean ratings of the judgments at each of the three experience levels on the first 


¹In the interests of brevity, only the results of our statistical analyses are included here. The 
complete statistical treatment of data is available on microfilm as mentioned in the introductory note. 

stimulus series. The data are presented in Table 1. There were no significant differences 
between the means. Comparison of the variances of these means by Bartlett’s 
test for homogeneity of variance, however, yielded a χ² of 16.94 for 2 d.f., which is 
significant beyond the .01 level. While this finding supports our hypothesis regarding 
differences in reliability (inter-judge agreement) between the three experience levels, 
the results are a complete reversal of the predicted direction. Our professional clin- 
icians are least reliable, our trainees next, and the undergraduates most reliable. 
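
As a rough check on the figure just reported, Bartlett's statistic can be recomputed from the rounded Scale 1 standard deviations in Table 1. The sketch below assumes 60 judges per group and the standard form of Bartlett's test with its usual correction factor; the function name is hypothetical, and the critical value is taken from standard chi-square tables rather than from the text.

```python
from math import log


def bartlett_chi_square(sds, ns):
    """Bartlett's test statistic for homogeneity of variance.

    sds: standard deviation of each group; ns: size of each group.
    The statistic is referred to chi-square with k - 1 d.f.
    """
    k = len(sds)
    dfs = [n - 1 for n in ns]
    total_df = sum(dfs)
    variances = [s * s for s in sds]
    pooled = sum(df * v for df, v in zip(dfs, variances)) / total_df
    numerator = total_df * log(pooled) - sum(
        df * log(v) for df, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / df for df in dfs) - 1 / total_df) / (3 * (k - 1))
    return numerator / correction


# Scale 1 standard deviations for clinicians, trainees and students.
chi_sq = bartlett_chi_square([1.28, 0.99, 0.74], [60, 60, 60])

# Critical chi-square for 2 d.f. at the .01 level, from standard tables.
CHI_SQ_CRIT_2DF_01 = 9.210
significant = chi_sq > CHI_SQ_CRIT_2DF_01
```

With these rounded inputs the statistic comes out at roughly 17, close to the reported 16.94; the small gap is presumably due to rounding of the tabled standard deviations.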
As stated above, the absence of any demonstrable effect from our anchoring 
conditions enabled us to combine the sub-groups of 20 at each experience level for 
stimulus series 2 and treat the data for reliability as done with stimulus series 1. 
These data also are found in Table 1. Again there were no significant differences between 
the means, but a Bartlett’s test comparing the variance of the means gave a 
χ² of 7.65 for 2 d.f., significant at the 5% level. This time, however, professional 
clinicians and trainees reversed positions. The trainees were least reliable, the professional 
clinicians next, and the undergraduates again most reliable. 


TABLE 1. MEANS AND STANDARD DEVIATION FOR EXPERIENCE LEVELS BASED 
UPON MEAN RATING BY INDIVIDUAL JUDGES 

                    Scale 1                         Scale 2 
Groups        Mean    Standard Deviation      Mean    Standard Deviation 
Clinicians    5.86    1.28                    6.30    1.25 
Trainees      5.84     .99                    6.51 
Students      6.15     .74                    6.41     .94 


As mentioned before, correlations were computed using Johnson’s formula 
(6, p. 184) as a further measure of the reliability of the judgments. While there is no 
adequate method for evaluating the significance of the differences between the r’s 
obtained, inspection shows the previous findings to be confirmed. The undergrad- 
uates showed the greatest inter-judge agreement, the trainees less, and the pro- 
fessional clinicians least. 


DISCUSSION 


Within the limits of this experiment and for this type of judgment, professional 
experience and training would appear to result in lowered reliability. At first glance, 
this might seem to be a disastrous reflection upon clinical training. Upon further 
consideration, however, the results are quite understandable in the light of the in- 
creased possibility of differing self-instructions, differing interpretations of the 
standard instructions, etc., for our experienced groups. It may well be that increased 
training and professional experience provides the experienced clinician with multiple 
frames of reference against which to evaluate behavior. These multiple frames of 
reference provide diverse grounds on which the actual judgment may be based, as 
was obvious from spontaneous comments offered by our experienced clinicians. 
“Clang” associations were sometimes viewed as not “severe,” paranoid thinking was 
not considered “disordered” by some, and some subjects indicated that they had 
made their judgments not on the severity of the disorder exhibited, but on its indica- 
tion for therapeutic accessibility or on its prognostic value for recovery. 


There is perhaps one homely caution that may be drawn from these results if our 
interpretation is correct. When dealing with experts in a judgmental situation, the 
task should be well defined and the criteria set forth clearly. Otherwise the riches of 
knowledge may yield confusion rather than clarity. 


SUMMARY 


Subjects with different degrees of professional clinical experience rated schizo- 
phrenic Wechsler-Bellevue and Terman vocabulary responses on an 11-point scale 
for degree of disorganization of thinking. Anchoring values were introduced as a 
means of influencing the judgments made. Specific hypotheses were advanced re- 
garding the effects of experience and anchoring upon the judgments made. No signi- 
ficant results due to anchoring could be demonstrated. Inter-judge agreement was 
found to decrease as a function of increasing experience. 


ANGER REACTION IN PARANOIDS 
VERNON W. GRANT 
Hawthornden State Hospital 


Thorne has called attention recently to the relative neglect, in our literature, 
of studies devoted to abnormal mental states dominated by anger and hostility. 
He proposes that the kinds of frustration which stimulate acute and chronic anger 
reactions be regarded as primary etiological factors, and that a corresponding 
psychiatric category should be provided for diagnostic purposes. He offers observa- 
tions concerning the relation of anger reactions to the psychology of the paranoid 
symptom, or process. The purpose of the present paper is to confirm the need for 
recognition of the “frustration-anger-hostility” state, to offer a few illustrative 
materials, and to formulate some suggestions as to the relation of anger reaction to 
paranoid psychology. 

The stimulus to these observations was the opportunity to study fairly intens- 
ively several psychotic individuals in whom expression of sustained hostility was 
unquestionably the outstanding feature of mental disorder, associated with evidence 
of the relation of this affect to delusional reactions. In each case the chronic anger 
could be correlated with frustrating circumstances or experiences. The anger re- 
action observed exhibited the following characteristics: (1) in one form or another it 
was more or less continuously manifested over periods of months or years; (2) it 
colored nearly all social perceptions and seriously impaired—at times entirely 
destroyed—capacity for objectivity of judgment; (3) its strength and pervasiveness 
was such as to overwhelm any attempt at insight therapy, even with highly intelli- 
gent individuals. 

The question may be raised whether, where the anger reaction may be seen as 
secondary to anxiety, for example, the classification should not more properly be 
made in terms of the more basic affect. It may be proposed, on the other hand, that 
in certain instances the reaction to anxiety does happen to be anger, or hostility; that 
this hostility may be the outstanding feature of the clinical picture, and that hostility, 
rather than anxiety, may be genetically related to delusional formations. Why, in 
these cases, anxiety led to anger reaction may not be clear, since the same affect may 
in other cases precipitate other kinds of psychoneurotic or psychotic states. Perhaps 
constitutional factors underlying anger reactivity may here be regarded as more 
truly basic than the anxiety itself. The central fact seems to be that, whatever its 
origin, the product of the pathologic process may be a disabling and chronic anger. 


CASE MATERIALS 


Case 1. H.J., female, age 46, high school education, married. Hospitalized because of homicidal 
threats and suicidal moods; extreme hostility toward her neighbors; repeatedly struck a child at 
play before her home; tore phone from the wall when unable to complete a connection; made 
delusional accusations against neighbors: e.g., that they spied on her, entered the house in her 
absence, were running an “abortion racket” nearby. She believed her food poisoned; became 
seclusive, rarely left the house. 

The patient has been chronically hostile during two years of hospitalization. This hostility 
is the primary feature of the illness. She has been consistently sullen, resentful and challenging 
during interviews. Her performance during tests was conspicuous for truculent negativism. Ex- 
pression of oppositional tendencies was pervasive, and she refused to retract or qualify any state- 
ment despite its obvious conflict with fact or common sense. The tone of her verbalizations was 
characteristically bitter. While general ward behavior has been generally cooperative, on oc- 
casion she has exhibited marked “contrariness.” 

Under narcosis the patient’s verbalizations took a form indicating hostility in part directed 
against herself. She made a number of self-deprecating remarks: said that people had never liked 
her, that she had never felt comfortable with them, that it must be because of her appearance; 
she repeatedly referred to herself as a failure. There were occasional moments of insight at this 
time: “Yes, I could have misunderstood my neighbors. I’m wrapped up in myself. My neighbors 
are alright; it’s me. I’m queer. I like ~— but they don’t like me.” Such data, along with others, 
were suggestive of an anger reaction developed by frustrations in the social sphere, initially 
aroused against herself, then projected, through rationalized accusations, upon others. 


Case 2. E. F., male, age 36, college graduate, single. First hospitalized at the age of 22, when 
he became depressed and suicidal about the time of graduation from college. Recovering after 
a few months of treatment, he secured a job much below his capacities and held it for several 
years. At 31 he began the study of law but again broke down. Following some family re- 
verses for which he held himself responsible he became very depressed and agitated and was 
again hospitalized. At a receiving hospital he was described as paranoid and hostile, on two 
occasions making homicidal threats against psychiatrists. 

Throughout interviews this patient was in a steadily sustained mood of bitterness, un- 
relieved but for rare moments when he was able to approach, though briefly, some degree of 
objectivity in discussing the background of his illness. The salient feature of the psychosis was 
a smouldering rage directed, for the most part, against himself, and expressed in the form of 
profane and somewhat dramatic charges of worthlessness. He lashed himself with accusations 
of incapacity and futility. Despite high intelligence and an excellent academic record he be- 
littled every achievement and stated he was never able to remember what he learned and that he 
had never learned anything of value anyhow. He made sweeping and unqualified charges of 
failure against himself; he likewise reviled the psychiatric treatment he had had as merely completing 
the ruin begun earlier. Electro-shock “smashed the be wee Pane out of (him) . . . left (his) 
mind an absolute blank.” His body too was a target: it was “nothing but a stinking mess, rotten 
with decay ... Yes, definitely there’s an odor of decay.” During psychological tests he alternately 
cursed the “asininity” of the materials and what he considered his inability to function with 
them. He repeatedly assured the examiner that the latter was wasting time on him, that he was 
“finished, dead, a mere clinical history.” His tone was steadily anger-charged and bitter, his 
thinking consistently distorted by the mood. 

The central trauma of the patient’s life was the personality of his father, a demanding and 
extreme perfectionist who belittled but never praised, and whose treatment was at times brutal. 
The rationale of his discipline was “training for toughness” to harden his children for protection 
against the rigors of life. In opposition, there seems to have been unquestioned evidence of love 
for them, and anxiety pee dy in all the children, who could neither accept nor entirely reject 
him. The mother strove as she could to soften the effect of his methods, but was herself much 
dominated by him. Despite the anxiety he aroused, the father, who was a man of culture and 
intellectual ideals, became the model for his son, who accepted personal power and intellectual 
superiority as the central goals of his own aspirations. The significant conflict and ultimately the 
psychosis appear to have developed out of what the patient regarded as his failure to reach his 
objective. In childhood and youth he was seclusive, fearful of his father, who repeatedly charged 
him with stupidity. From this treatment he sought refuge in his mother, to whom he was very 
close. He was painfully shy, and says he once refused to speak before a class because he was sure 
he would collapse in fright. Much introverted, he resolved to become a solitary intellectual and 
decided he would never marry. 

At college he made the painful discovery that he was no longer exceptional intellectually; 
he studied almost constantly, suffered acute anxiety during examinations and in his last year 
broke down entirely and was unable to complete his work, though allowed to graduate because of 
his excellent record. Several years later, after entering law school, he again broke down because 
he found himself unable to reach the impossibly high standards of achievement he had set himself. 

Frustration relative to an “ego-ideal” here appears as the setting of a chronic rage reaction 
directed primarily against the self, though often overflowing with little discrimination. Despite 
his intelligence he was unable to compromise; his reaction to goal-seeking was of an “all or noth- 
ing” character. Failures were in large part the consequence of paralyzing anxiety. The somatic 
delusions were expressions of self-directed rage, and also rationalizations of his defeatism, since 
conviction of his incapacity justified his renunciation of all effort. 


Case 3. R. K., female, age 26, college graduate, single. At college this girl became progressively 
seclusive and irritable, appearing to be laboring under a strain. She was released from a job, 
following graduation, because of inattention to her work. At home she was irritable and con- 
tentious in relations with her mother; once spontaneously attacked the latter, blacking both 
eyes, after announcing “I just feel like fighting.” She became depressed, went out alone and 
cut her neck superficially with a razor blade, then called home for her parents to come and get her. 

Outstanding in the patient’s behavior during the period of observation was her generalized 
hostility. Her features were characteristically mask-like and cold in expression. While the emo- 
tion did not rise to outburst intensity at any time, it was fairly continuously intrusive in verbal- 
izations, particularly showing in what the patient offered as ostensibly objective comments but 
which in reality were bitterly cynical. Her behavior on the ward and during visits to group 
therapy was markedly negativistic; her hostility was often veiled by rather subtle devices, such 
as studied inattention to a speaker, deliberate delay in response to a question as a gesture of rebuff 
or of aggressive indifference, or refusing acknowledgment of a greeting with a shammed air of 
abstraction. No specific or systematized paranoid ideation was evidenced, nor did the patient 
at any time appear to be hallucinated. 

As a child the patient was shy and sensitive, very obedient and inclined to be passive with 
others. The parents were undemonstrative and demanding, “rigid, unaffectionate, frequently 
rejecting”; though punitive, they were nevertheless overprotective and fostered dependency. 
The mother was the more dominating personality. The patient feared, admired and identified 
with her mother. The latter later acknowledged, with manifest guilt feelings, “unmerciful” 
scoldings. The patient recalled resentment felt toward her mother, generated by competition 
with her for the father’s attention, though adding the qualification “It never reached murderous 
proportions.” She entered into various activities with the hope of gaining the attention and 
affections of her parents; she once remarked that the fact her father had paid her college tuition 
must have meant that he really had some regard for her. In social relationships she was chronical- 
ly insecure, feared unpopularity and rejection, and significantly said of girl acquaintances that she 
did not like them and they did not like her. At school she feared her teachers and was often fearful 
of failure because it symbolized parental rejection. She recalls the anxious feeling of “being 
lost.” 

While evidence of fear of rejection and loss is traceable here for many years, it is also clear 
that hostility came to dominate the picture. While its overt release against the mother did oc- 
casionally occur, it was for the most part held in check. During the period of hospitalization it 
appears to have been “sublimated” in the form of generally diffused and chronic irritability and 
antagonism. The effect of the conflict on rapport is expressed in the patient’s remark, “I have no 
contact with things outside me. There’s something between me and the world, a lack of feeling,” 
illustrating, perhaps, Thorne’s observation that “The anger state prevents the person from 
identifying with other people since identification implies some degree of rapport or love relation.” 


DISCUSSION 


Prominent in current theory of the paranoid state is the “mechanism” of pro- 
jection, by which needs and conflicts are outwardly displaced upon the social en- 
vironment in the form of a reading-into-others of the motives which in reality dom- 
inate the subject’s own behavior. While of value for preliminary orientation the 
formula does not, as Thorne notes, complete the dynamics of the paranoid process. 
We need to know, for each of the various kinds of projection, its “unconscious 
logic” in terms perhaps comparable to those offered for the psychoanalytic account 
of paranoid symbolism in the latent homosexual. In our first case, for example, 
the interpretation suggested by the patient’s total behavior was that projections of 
hostility upon her neighbors served the purpose of a rationalization of her own re- 
sentment toward them, rather than in any literal sense a “displacement” of it. 
People did not like her, she believed, and this rejection, frustrating her social needs 
and depressing her self-esteem, aroused resentful anger. This anger could not, 
however, be acknowledged and directly expressed for what it was without humiliation; 
i.e., the admission “I don’t like them because they don’t like me” would carry 
too much of the flavor of a childish and unreasonable petulance. It would also link 
the outwardly directed hostility uncomfortably close to its painful-to-contemplate 
stimulus motive of rejection. The hostility could be more satisfyingly justified 
simply by delusional formations which enabled her to vent her anger through the 
medium of accusations: she resented them, then, because they annoyed her in var- 
ious ways. In this context her accusations ranged from mild charges, some with 
perhaps a discoverable pretext (though given exaggerated meaning), to pure fabrication, 
e.g., they entered her home in her absence, or were running an “abortion 
racket.” The behavior here resembles that of a teen-age boy who, denied invitation 
to a party, began to find various faults with the host, and finally concocted charges 
fairly obviously untrue. 

In our second case the dominating hostility was expressed most specifically in 
the form of somatic delusions which, while in part probably rationalizations of de- 
featism, were also direct and very bitterly toned outlets for self-directed anger. The 
subject appeared “frozen” in a kind of hate psychosis of perpetual railing against the 
defects he regarded as responsible for his failures: a poor memory, a futile intellect, 
and unaccountable seizures of anxiety at times of stress. Beyond the delusional outlets, 
however, the hostility overflowed in indiscriminate vilifications of psychiatrists 
and hospital treatment generally, of the quality of his education, of the blunders he 
had made, of his father, and of the course of the family fortunes. Chronic anger con- 
sistently colored his valuations and rendered him incapable, except for brief flashes, 
of an objective orientation toward his problem. Such clinical phenomena as this 
seem to offer the main justification for a psychotic category of “anger reaction” in 
that the affect is not only dominant, but almost continuously so, only occasionally 
receding, and never completely, into the background with situational change. 

The only distinctive feature of our third case was the apparent absence of sys- 
tematic delusional radiations of the basic hostility, but certain of the patient’s 
numerous distorted and anger-colored verbalizations were clearly oriented in the 
paranoid direction. 

Introduction of a classification for the anger reactions should not, of course, 
be taken as suggesting any categorical difference between this group and other cases 
in which the same affect makes a merely less salient component of the syndrome. 
Those cited represent selections from a larger number in which hostility is prom- 
inent, and it would not be difficult to parallel these with others in which the role and 
meaning of hostility would be equally clear, but in which it would be seen as relatively 
less marked as compared with other expressions of affective conflict, e.g., anxiety, 
guilt, or self-assertion, or in which paranoid formations showed more of the charac- 
ter of fulfillment mechanisms than of displacements or rationalizations. In some of 
these the anger is more effectively subdued by anxiety, and in some its expressions, 
while clearly related to paranoid processes and to overt symptomatic behavior, tend 
to be more channelized in the sense of confinement to the conflict area. The latter 
people are more like those described by the layman as “perfectly normal until you 
talk about certain things”; the emotional state lacks the relative continuity featured 
in our illustrations. 

Increased attention to the dynamic role of anger in paranoid disorders may 
finally demonstrate for it a larger place than it has hitherto had. Whether, as 
Thorne proposes, a basic anger reaction will account for “most of the clinical 
observations... concerning paranoid behavior” would seem to depend in part on what 
is included in the term “paranoid.” If we take it, as Cameron, for example, suggests 
@, 2), as standing for a tendency toward false interpretations, and if we consider it 
dynamically as driven by a motive or affective state, there appears no ground for 
limiting its function to the service of any particular need. The anger reaction illum- 
inates instances of transfusing and generalized hostility, and likewise instances in 
which, by way of rationalization, hostility is imputed to others as a justification for 
aggression, but question may be raised as to whether all perceived hostility must be 
seen as a projection of this emotion. Persecutory ideation, for example, may ap- 
parently be set going by a deep-lying guilt “complex” which sensitizes the victim 
to a variety of neutral stimuli and transforms them into threats. 

A more analytic treatment of the concept of projection itself would be helpful 
in delineating the psychology of the paranoid symptom. Too often it is treated as a 
“mechanism” of simple displacement of motive from one individual to another as if 
it were a psychologically elemental mode of defense. A woman’s delusion of her 
husband’s infidelity interpreted, for example, as an outward reflection of an un- 
acknowledged impulse of the same kind becomes more intelligible when viewed as a 
means of guilt relief by “equalization,” than it does when merely assimilated to a 
general phenomenon of protective transfer of motive from one to another. (A different 
“dynamic” of such a delusion is obviously possible: the intention is only to exemplify 
a more explicit mode of treatment.) Further illustration might be made of 
the psychotic homosexual who not only complained of the advances to which he was 
frequently subjected by other patients but also said he regarded almost the entire 
hospital population as homosexuals (“This whole place is a ‘fruit-farm’, isn’t it?”) 
and likewise the majority of all male humanity, suggesting that the projection was 
here in part an attempt to redeem himself, in a sense, by including as many others 
as possible within the scope of his own aberration. 


That delusional tendencies may be associated with various motives, e.g., erotic, 
or egoistic, seems accepted. Admittedly there is an arbitrary element in the defini- 
tion of “paranoid,” a term which, as Cameron notes, has in the past had changing 
meanings. If it is true, as he suggests, that in current usage it is equivalent to “delusional,” 
it may seem best to consider delusional interpretations and kindred perceptual 
distortions as a general phenomenon which may be motivated by any strong 
need or affect. 


SUMMARY 


A recent proposal of the need of a new psychiatric classification for pathological 
states characterized by anger and aggressive reactions is discussed. Case descrip- 
tions are presented illustrating pathological behavior in which anger and hostility 
appear to be the outstanding and chronic feature. The dynamics of these illustra- 
tive cases are suggested. Some comments on Thorne’s observations concerning the 
relation of anger reactions to paranoid behavior are offered. 


MEASUREMENT OF HOSTILITY: A PILOT STUDY 
SHANNA MC GEE 


Western Reserve University 


PROBLEM 


The field of attitude measurement is still in its infancy, and adequate tools are 
not yet available to the clinician for measuring many of the attitudes that are cru- 
cial in interpersonal relationships. One such attitude whose measurement barely 
has been attempted is hostility. 

One of the pioneering attempts to measure hostility was made by Adorno et 
al.“) In his work the hostile person is pictured as fundamentally an insecure in- 
dividual with low ego strength and an overactive super ego. He protects himself 
from the guilt feelings which arise as a result of his feeling of moral inadequacy by 
projecting these moral inadequacies onto others. This allows him to express hostility 
toward other people, in the guise of being a guardian of morals, which, in turn, helps 
to bolster his weak ego by showing his superiority to those he is denouncing. 

This concept of a personality constellation of mingled hostility and virtue was 
used by Cook, Leeds and Callis®’ in the development of the Minnesota Teacher 
Attitude Inventory and is described by them as follows: The hostile, insecure in- 
dividual “frequently seeks security through virtue. He adheres rigidly to conven- 
tional middle-class standards. There is a tendency to be on the lookout for and to 
condemn, reject and punish anyone who violates conventional rules. All misbehavior 
is serious, to be dealt with severely, never to be passed off as a joke. There is little 
sense of humor, only a sense of justice perverted by general hostility toward people.” 

From his work on the MTAI, Cook“? developed experimental hostility and 
virtue scales (the latter not used in the present study) for the Minnesota Multi- 
phasic Personality Inventory. Of the development of these scales Cook states: 
“We gave the Minnesota Teacher Attitude Inventory to approximately 2,000 Min- 
nesota teachers for the purpose of setting up norms. We then selected the highest 
scoring 150 of these teachers and the lowest scoring teachers and gave these two 
extreme groups the Minnesota Multiphasic Personality Inventory. The present scales 
of the MMPI were not effective in distinguishing between good (friendly, non- 
authoritarian) and poor (hostile, authoritarian) teachers. However, 250 of the 
Multiphasic items were significantly different at the five per cent level. From these 
250 items I selected, according to my own judgment, those items which measure 
hostility and another sample which measures self-righteousness.” 

The trait of hostility may be an organized whole or a totality. On the first 
hypothesis, it is a gestalt within a person, a pool that could be tapped equally well 
by various measures. In other words, a hostile person would make a hostile response 
to any sort of hostility-provoking stimulus. On the other hand, people may com- 
partmentalize hostility—show it only toward certain racial or religious groups, or, 
as is more popular in a university setting, toward authoritarian personalities. 
Whether hostility is a gestalt or a totality, it may be treated in Lewinian terms, or, 
as will be done in this paper, in terms of Hull’s habit-family hierarchy “?. 

For example, if a subject is presented with a stimulus, say a word association 
test, he will make the overt response of a hostile word only if the stimulus word 
happens to sensitize an area in which he has hostile feelings; because if it does not, 
no response-produced stimuli will be evoked, and hence there will be no goal re- 
sponse. However, because of the high degree of stimulus generalization, one would 
expect a certain degree of diffuseness of hostile feelings. If hostility is an organized 
whole, any stimulus might trigger a hostile response. Thus, on either hypothesis, 
there should be a moderate degree of correlation between various types of measures 
of hostility. 

The hypothesis of this paper is that hostility can be measured by objective 
tests and that significant correlations will be found between various measures of 
hostility, crude and unperfected though the present measuring devices may be. 


METHOD 


The author tested 27 undergraduates, 14 males and 13 females, with three 
different instruments, a word association test, a picture sorting test, and the Cook 
hostility scale of the MMPI. 

The rationale behind the use of word association tests is too well known to re- 
quire elaboration here. Eriksen“) gives an excellent thumbnail account of the 
theory: ‘Perceptual defense occurs for stimuli that are related to an individual’s 
unacceptable needs. Perceptual defense is in effect an extension of the defensive 
operations of the ego into perceptual function.” 

The word association test, which was administered with the directions usual 
for such tests, consisted of the following 28 words: table, throw, cut, walk, paper, 
hear, hurt, desk, strike, dance, hit, street, happy, tree, whip, talk, pencil, bite, radio, 
judge, eat, apple, suicide, sing, bruise, picture, flower, lie. Ten of these words were 
critical: cut, hurt, strike, hit, whip, bite, judge, suicide, bruise, lie. (Three of these, 
cut, bite, and suicide, come from the Rapaport test ®.) In an attempt to minimize the 
contamination of scores by emotions other than hostility, separate scores were com- 
puted for the responses and for the time taken to give them, and correlations run 
with both of the resulting scores. 

Three clinicians made independent judgments as to how many of the responses 
of each of the 27 subjects to the ten critical words were hostile (in ignorance of the 
time taken to make the response). The mean score of the three judges was computed 
(there was almost perfect agreement) and called the W score. This is simply the 
number of hostile responses each subject made. The median time for response was 
computed for each subject on critical and neutral words. (The median was used 
rather than the mean because it is less affected by extreme scores.) The median of 
the neutral words was then subtracted from the median of the critical words, giving 
an E (excess time) score for each subject. The product moment correlation between 
the W and E scores was .46. 
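
The scoring steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the four subject records are hypothetical and are not data from the study; only the definitions (W as the mean of the three judges' counts, E as the critical median minus the neutral median, and the product moment correlation) come from the text.

```python
from math import sqrt
from statistics import mean, median


def w_score(judge_counts):
    """W score: mean of the judges' counts of hostile responses to the critical words."""
    return mean(judge_counts)


def e_score(critical_times, neutral_times):
    """E score: median time on critical words minus median time on neutral words."""
    return median(critical_times) - median(neutral_times)


def pearson_r(xs, ys):
    """Product moment correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical records for four subjects (NOT data from the study):
# three judges' counts, then response times in seconds.
subjects = [
    {"judges": [2, 2, 3], "critical": [1.8, 2.0, 2.2, 6.0], "neutral": [1.5, 1.6, 1.7]},
    {"judges": [5, 5, 5], "critical": [2.5, 3.1, 3.4, 3.0], "neutral": [1.4, 1.6, 1.9]},
    {"judges": [1, 1, 1], "critical": [1.6, 1.7, 1.9, 1.8], "neutral": [1.5, 1.8, 1.6]},
    {"judges": [4, 3, 4], "critical": [2.2, 2.9, 2.6, 9.0], "neutral": [1.6, 1.7, 1.5]},
]

w = [w_score(s["judges"]) for s in subjects]
e = [e_score(s["critical"], s["neutral"]) for s in subjects]
print("W scores:", w)
print("E scores:", e)
print("r(W, E) =", round(pearson_r(w, e), 2))
```

Note that the single very slow responses (6.0 and 9.0 seconds) barely move the E scores, which is the reason given above for preferring the median to the mean.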

In the picture sorting test, the subjects were shown 10 Szondi pictures, which 
they were asked to rate on a five point scale ranging from Very Mean (1) to Very 
Friendly (5). Thus when these ratings were added, the higher the score the more 
friendly the “prisoners” were perceived as being, and this F (Friendliness) score 
could reasonably be expected to correlate negatively with other measures of hostil- 
ity. The subjects were then shown the remaining 38 Szondi pictures, which they 
were told were of mental hospital patients, and asked to sort them into those who 
looked Dangerous, Tricky and Deceitful, and Neither. A product moment correlation 
of -.32 was found between the F and D (Dangerous) scores.1 

The third test was the MMPI, and the H (Hostility) score was the number of 
items marked in the hostile direction on the Cook scale. 


RESULTS 


The product moment correlations between the various tests are given in Table 1. 
All are significant at the 5% level or better, except the E-F correlation, which reach- 
ed only the 10% level of significance. 

The correlations between the two scores on the word association test (W and 
E) and the Cook scale (H) were not significantly different from zero, nor were those 
between W and D, and E and D (the pictures judged Dangerous). 
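
The paper does not say which significance test was applied, nor whether it was one- or two-tailed. As a hedged sketch, the conventional conversion of a product moment correlation to a t statistic with n - 2 degrees of freedom is shown below, using the 27 subjects from the Method section and the coefficients as read from Table 1. The function name and dictionary labels are illustrative assumptions.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic for testing a correlation r against zero, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


N_SUBJECTS = 27  # number of undergraduates tested, from the Method section

# Coefficients as read from Table 1; the pair labels are shorthand for this sketch.
table_1 = {
    "W-E": 0.46,
    "E-F": -0.25,
    "W-F": -0.33,
    "F-D": -0.32,
    "H-F": -0.40,
    "H-D": 0.44,
}

for pair, r in table_1.items():
    t = t_from_r(r, N_SUBJECTS)
    print(f"{pair}: r = {r:+.2f}, t({N_SUBJECTS - 2}) = {t:+.2f}")
```

Each t value would then be compared with a tabled critical value for 25 degrees of freedom at the chosen level and number of tails.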


1This picture sorting was done as a cross-validation of unpublished research work by R. W. 
Wallen on the measurement of hostility through the judgments of pictures. Scores for the Tricky 
pictures are not used in this study. 

TABLE 1. CORRELATION BETWEEN VARIOUS HOSTILITY MEASURES 

Measures compared                                         r 
W (number of hostile responses) and E (excess time)      .46 
E and F (Friendliness)                                  -.25 
W and F                                                 -.33 
F and D (Dangerous)                                     -.32 
H (Hostility) and F                                     -.40 
H and D                                                  .44 

This experiment is merely a pilot study and raises as many questions as it 
answers. Why, for instance, since eight of the ten critical words dealt with physical 
violence, do the scores on the word association test correlate negatively with the 
judged friendliness of pictures but fail to show any correlation with the degree of 
dangerousness assigned other pictures? Since the MMPI and the word association 
test are both verbal measures, why was no correlation found between the hostility 
scores on these two tests when correlations in the .40’s were found between the 
MMPI hostility score and the picture sorting scores? Did the fact that the subjects 
were told the pictures were of prisoners and mental hospital inmates affect their 
judgments, and if so, how? Further studies are in progress which it is hoped will 
answer these and other questions. 

Even within the limited framework of the present study, it is noteworthy that 
low but significant correlations were found between three different types of measures 
of hostility. The importance that a valid instrument for the measurement of hostility 
would have for clinical diagnosis or industrial selection is obvious, and it is hoped 
that further research will prove that Walter Cook has put the blueprint for such an 
instrument in our hands. 


SUMMARY 


This experiment was a pilot study to determine the feasibility of measuring 
hostility by objective tests. Three different measures of hostility, a word association 
test, a picture sorting test, and the Cook hostility scale of the MMPI were admin- 
istered and low but significant correlations found between them. However, much 
further research is needed on hostility and the degree of its compartmentalization 
before we can hope to have highly valid, reliable instruments for its measurement. 


EFFECTS OF MEPHENESIN AND PRENDEROL ON INTELLECTUAL 
FUNCTIONS OF MENTAL PATIENTS1 


AUDREY B. MAILER 
Traverse City State Hospital 


PROBLEM 


This study was designed to evaluate possible changes in intellectual perform- 
ance in mental patients under the influence of drugs purported to relieve anxiety. 
Mephenesin (Tolserol, Myanesin), first reported by Berger and Bradley in 1946, 
has been widely investigated as to both its physiological and its clinical effects. It 
is known to have certain anti-convulsant properties and to relieve spasticity by 
exerting a depressant action on the multisynaptic reflexes“: ®. Several clinical 
studies“ 7. 8. 9) report mephenesin to be useful in reducing anxiety in mental pa- 
tients. Prenderol (2,2-diethyl 1,3-propanediol) is also an anticonvulsant drug ®? and, 
while markedly less potent than mephenesin, it has a similar depressant action on 
multisynaptic reflexes“). No clinical studies have appeared in the literature as yet 
to illustrate the therapeutic value of this agent, but the structural similarity of the 
drugs suggests that Prenderol may prove comparable to mephenesin in allaying 
anxiety in mental patients. 


METHOD 


Fifteen newly-admitted and untreated mental hospital patients provided the 
experimental group for this study. Care was taken to omit mental defectives or 
patients suspected of organic brain damage, as well as patients over 45. These 
patients, who ranged in age from 23 to 42, included 3 males and 12 females. Of this 
group, eight were later diagnosed schizophrenic, six psychoneurotic, and one simple 
adult maladjustment. 

Approximately one week after admission, each patient was administered the 
Full Scale Wechsler-Bellevue, Form I, to establish his current level of intellectual 
function. Within a twenty-four-hour period, the patient was given 2.5 to 3.0 gms. 
of mephenesin orally, then re-tested using Form II of the Wechsler. A half-hour 
was allowed between time of ingestion of the drug and the beginning of the inter- 
view. 

Since it seemed desirable to obtain the comparative effects of mephenesin and 
Prenderol on the same group of subjects, the above-mentioned procedure was em- 
ployed a week later, using Prenderol as the experimental drug. Form I was re- 
peated in order to appraise practice effect and any spontaneous improvement which 
might have occurred in the patient during the intervening week. Later the same day, 
each patient was given 5.0 gms. of Prenderol orally in tablets and, after thirty min- 
utes, the Full Scale Wechsler-Bellevue II was administered. 


RESULTS 


All patients accepted the somewhat tedious repetition of tests without com- 
plaints. Subjective reactions to the medications were varied but in general Prenderol 
elicited the more pronounced effects. Eight subjects reported no difference in sensa- 
tion or mood tone after the ingestion of mephenesin, whereas only three could detect 
no difference after Prenderol. Five patients employed “woozy”, “weak”, “wobbly”, 
or “lightheaded” to describe their feelings after mephenesin, two claimed to be 
“sleepy” or “groggy”, while only four admitted that they felt more relaxed. In one 
case, some slight slurring of speech was detected while the patient was under the 
influence of mephenesin. 


1Grateful acknowledgment is made to William H. Funderburk, Ph.D., for his advice and criticism 


on the meen ae aspects of this study and to H. Sidney Newcomer of Squibb Institute for 
Medical Research for the supply of the compounds used. 

More striking effects were noted by the patients after Prenderol had been given. 
Eight felt “weak” and “lightheaded”, three stated that they felt “drunk”, and six 
others complained of feeling very “drowsy” or “sleepy”. The one patient who purported 
to feel more “relaxed” belied this by adding that she felt nervous because she 
“felt so different”. One extremely anxious patient (who felt “no different” following 
2.5 gm. of mephenesin) stated that she might feel “a little better”. 

The examiner observed marked sluggishness and apathy in three patients under 
the influence of Prenderol and one other patient became so grossly incoordinated 
she was unable to walk back to her ward. Slight ataxia was common following the 
use of Prenderol. One anxious, very constricted neurotic patient, who had reported 
no effect from mephenesin, became quite garrulous following the administration of 
Prenderol and began voicing family animosities which she had hitherto been unable 
to discuss. 


Table 1 gives the mean IQ and standard deviation for the four different test 
administrations. 


TABLE 1. WECHSLER-BELLEVUE MEAN IQ’s 

                                    Verbal         Performance      Full Scale 
Group                             Mean    SD      Mean    SD       Mean    SD 
I.   Original W-B I               94.83  13.12   101.11  15.75     99.66  14.93 
II.  W-B II after Mephenesin      95.95  10.61   107.06  15.48    102.73  13.53 
III. W-B I (one week later)       99.73  12.01   115.33  14.79    107.73  19.70 
IV.  W-B II after Prenderol       90.93  11.74   106.13  22.69     98.20  16.51 

Table 2 shows the significance of the mean differences obtained for the various 
group comparisons. Group labels used are the same as those designated in Table 1. 


TABLE 2. “t” VALUES AND LEVEL OF SIGNIFICANCE COMPUTED FOR MEAN 
DIFFERENCES BETWEEN SCORES ON EACH TEST 

Group            Mean                    Level of 
Comparisons      Difference      t       Significance 
I and II           +4.73        1.86     aa 
I and III          +8.66        3.36     .01 
II and IV           4.60        1.65     od 
III and IV         -9.26        2.46     .01 


DISCUSSION 


There was no significant improvement in intellectual functioning after the 
ingestion of mephenesin, and there was little clinical effect noted by either the sub- 
jects or the examiner. By the time Form I was readministered a week later, there 
was an elevation of Full Scale IQ’s which was significant at the .01 level. In one or 
two cases the striking elevation of all sub-test scores seemed safely attributable to 
the marked improvement in mental condition which occurred in these patients. For 
the most part, as can be seen by Table 1, the rise in Full Scale IQ’s is dependent 
upon the significant increase on the Performance Scale, suggesting that the in- 
crement is principally due to increasing familiarity with the test items. 
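
The paper does not give the formula behind the t values in Table 2. Since the same fifteen patients took every administration, one plausible reading is a t test on paired differences, sketched below. The function name and the six pairs of Full Scale IQ's are hypothetical illustrations, not the study's data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (mean difference, t) for paired scores, with len(first) - 1 degrees of freedom."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = mean(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 denominator
    return mean_diff, mean_diff / standard_error


# Hypothetical Full Scale IQ's for six patients on two administrations
# (NOT data from the study).
form_1_first = [92, 101, 88, 110, 97, 105]
form_1_week_later = [99, 108, 97, 115, 101, 116]

mean_diff, t = paired_t(form_1_first, form_1_week_later)
print(f"mean difference = {mean_diff:+.2f}, t({len(form_1_first) - 1}) = {t:.2f}")
```

A positive mean difference here corresponds to a rise from the first to the second administration, as with the practice effect discussed above.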

Because of practice effect one might expect the Full Scale quotients, and 
particularly the Performance IQ’s, to continue rising even though the experimental 
drug Prenderol might be exerting no beneficial effect on intellectual processes; however, 
this is not the case. The test administration after Prenderol resulted in a significant 
decrease in intelligence ratings due to poorer achievement on the Performance 
Scale. 


Several possible reasons for this impaired intellectual functioning after Prenderol 
are: 

(a) Muscles are frequently relaxed to the point of poor coordination. 

(b) Motivation decreases as patients become more relaxed; they become more 
indifferent to time pressures. 

(c) Many patients appear distracted by their unfamiliar sensations of dizziness, 
grogginess, etc. 

It is probable that a milder dosage of Prenderol would eliminate these undesirable 
side-effects but there is no reason to assume that it would result in clearer thinking 
on the Wechsler-Bellevue. 




SUMMARY AND CONCLUSIONS 


Fifteen mental hospital patients (3 men and 12 women) were subjects in 
a study to determine the effects on intellectual function of two structurally 
similar drugs, mephenesin and Prenderol. 


The Wechsler-Bellevue, Forms I and II, were administered in this sequence: 
Form I, Form II using mephenesin, Form I a week later, and Form II using 
Prenderol. 


Results show a significant improvement in scores, largely on the basis of 
practice effect, from the first to the second administration of Form I. No 
significant improvement was found following use of mephenesin, but there 
was a decrease in Full Scale IQ’s following the use of Prenderol which was 
significant at .01 level. 


On the basis of these results, there appears to be no evidence that there is 
improved mental functioning following the use of these two drugs alleged 
to reduce anxiety in mental patients. 


Further studies might produce more favorable results by altering dosage of 
the drug or testing after protracted administration of the drug. 


THE SERVICEABILITY OF MILITARY PERSONNEL OF LOW 
INTELLIGENCE* 


WILLIAM A. HUNT CECIL L. WITTSON 
Northwestern University University of Nebraska College of Medicine 
AND 
EDNA B. HUNT 


PROBLEM 


In assessing the military serviceability of marginally adjusted individuals, two 
types of criteria are commonly used: the number of individuals within the marginal 
group who fail to complete a stated period of military service (i.e. are discharged for 
neuropsychiatric, other medical, or disciplinary reasons before service is completed), 
and various performance measures (i.e. incidence of hospitalization or disciplinary 
difficulty, etc.) for those who do complete the stated term of service. In two pre- 
vious studies ®: *) applying the second type of criterion the authors have shown that 
individuals of low intelligence, defined as a mental age of approximately 12 years or 
less, who successfully completed a term of military service, nevertheless had a higher 
incidence of hospitalization and of disciplinary difficulty than did a control group 
of individuals of higher intelligence. The present study applies the first type of 
criterion, discharge or attrition rate, to a comparable sampling of some 597 Naval 
recruits of low intelligence. 

METHOD 


This experimental population is made up of recruits who were studied upon the 
observation ward of the Psychiatric Unit at the U. S. Naval Training Center, Newport, 
R. I., during the years 1942 and 1943. All had a mental age of 12 years, 6 
months or below as established by psychological testing by qualified psychologists. 
For purposes of further analyses they were separated into groups in which low intelli- 
gence was the only difficulty present, and groups in which other neuropsychiatric 
symptomatology was present in addition to the low intelligence. In the 1942 samp- 
ling there were 165 cases merely showing low intelligence (MA range from 10 years 
to 12 years, 6 months with a mean of 11 years, 3 months) and 157 cases of low in- 
telligence plus other psychiatric symptomatology (MA range from 9 years, 6 months 
to 12 years, 6 months with a mean of 11 years, 4 months). In the 1943 sampling 
there were 67 cases of low intelligence only (MA range from 8 years, 9 months to 12 
years, 6 months with a mean of 11 years, 1 month) and 208 cases of low intelligence 
plus other symptomatology (MA range 9 years, 4 months to 12 years, 6 months with 
a mean of 11 years, 2 months). All were adjudged capable of rendering military 
service and were sent to duty from the observation ward. No entry was made on 
the recruit’s health record so his subsequent career was not prejudiced by the find- 
ings on the ward. The cases represented all those available under these criteria in 
the Newport records for 1942 and 1943. As a control group, an equal number of 
“normal” individuals were randomly selected from among those men studied on the 
ward at the same times but adjudged to present no evidence of either low intelli- 
gence or psychiatric symptomatology and who were consequently sent to duty. 

With the cooperation of the Naval Medical Records Office, Garden City, Long 
Island, the subsequent service and medical records of these men were examined for a 
period up to January 1, 1946 and a tabulation was made of all discharges under the 
categories neuropsychiatric, other medical, and bad conduct. The samples reported 
are slightly smaller than those originally selected owing to the loss of records at the 
Naval Records Office. A previous study, however, has shown that such shrinkage 
attributable to loss of records does not contribute any systematic bias®’. It must 


*This study is part of a larger project continuing under ONR contract 7onr-450(11) with North- 
western University. The opinions expressed, however, are those of the individual authors and do not 
represent the opinions or policy of the Naval service. 
be remembered that the 1942 and 1943 samples are not directly comparable since 
changing manpower standards may have differentially affected the caliber of the 
men and the 1942 sample was studied for 3 years, as contrasted with 2 years for the 
1943 sample. 

RESULTS 


Table 1 gives the attrition or discharge figures in percentages for the different 
groups. The low intelligence group has a higher incidence of discharges than the 
controls in every category, with the single exception of “medical” in the 1943 sample. 
The cases of low intelligence plus other psychiatric symptomatology have higher 


TABLE 1. PERCENTAGE OF DISCHARGES FOR NEUROPSYCHIATRIC, MEDICAL, AND DISCIPLINARY 
REASONS. (NP = neuropsychiatric, Med. = other medical reasons, 
and BCD = bad conduct discharge) 

                               1942 % Discharged           1943 % Discharged 
Groups                       N     NP    Med.   BCD      N     NP    Med.   BCD 
Control                     322    2.1    1.9    0.0    275    Fe     1.8    0.0 
Low MA                      165    7.9    3.1    9.1     67    7.5    1.5    3.0 
Low MA plus Psychiatric     157   12.1   10.2   10.2    208   10.6    5.3    7.7 

discharge rates than the low intelligence only group in every category without ex- 
ception, and of course are thus higher than the controls as well. The incidence of 
discharges thus seems to rise through the marginally intelligent to those who are 
also handicapped by added adjustmental difficulties. The differences between 
controls and both experimental groups are statistically significant (5% level or 
better) with the exception of medical discharges for both the 1942 and 1943 low 
intelligence only groups. While the differences between low intelligence only and 
low intelligence plus psychiatric symptomatology are uniform, with the latter al- 
ways having the higher rate, the differences are statistically significant only for the 
category of medical discharges. The uniformity of the findings for the two separate 
samples from different years adds to the reliability of the study. An analysis also 
was made of the mean mental age within the various types of discharge. Within our 
limited samples the differences were neither significant nor consistent. 
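
The authors do not name the test used to compare discharge rates. One common choice for two independent percentages is the pooled two-proportion z test, sketched here for the 1942 neuropsychiatric rates in Table 1 (7.9% of 165 low MA recruits against 2.1% of 322 controls). The function name and the choice of test are assumptions made for illustration only.

```python
from math import sqrt


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent proportions, pooled variance."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# 1942 neuropsychiatric discharge rates as given in Table 1.
low_ma_rate, low_ma_n = 0.079, 165
control_rate, control_n = 0.021, 322

z = two_proportion_z(low_ma_rate, low_ma_n, control_rate, control_n)
print(f"z = {z:.2f}")
```

With cell counts as small as some of those in Table 1, an exact test might be preferred; the z test is shown only because it is the simplest to follow.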

The results of this study added to those from the two previously mentioned 
studies °: *) show clearly that the military potential of individuals of low intelligence 
is much less than that of individuals who are not so handicapped. The individual 
of low intelligence has less chance of successfully completing his enlistment, and if 
he does successfully complete it will still be more of a burden upon the services’ 
hospital and disciplinary facilities. This greater cost attendant upon the military 
utilization of those individuals of marginal intelligence must be considered in mil- 
itary manpower planning. 


SUMMARY 


Groups of individuals of low intelligence and of low intelligence plus psychiatric 
symptomatology followed through a period of military service showed higher discharge 
rates for neuropsychiatric reasons, other medical reasons, and bad conduct 
than did a group of ‘normal’ controls. 
JUSTIFICATION AND COMMAND AS TECHNIQUES FOR
HYPNOTICALLY-INDUCED ANTISOCIAL BEHAVIOR 


WILLIAM LYON 


University of Hawaii* 


PROBLEM 


The question of whether hypnotized persons can be induced to engage in anti- 
social behavior is still unsolved. Whether experimental results were positive or 
negative seems dependent upon (a) definition of “antisocial”, (b) the operator’s
attitude and (c) technique. In the present experiment, we defined an act as “harm- 
ful or antisocial or dangerous if damage is done or danger is run, whether or not the 
actor performs the act knowingly and willfully.”@ Our experimental design is 
intended to answer the following questions: 1. Can antisocial behavior be induced
in an hypnotized subject? 2. Are antisocial suggestions more likely to be complied 
with when the suggested behavior is justified to the subject rather than simply 
commanded? 


PROCEDURE 


The subjects in this experiment were 43 university students, ranging
from 18-year-old freshmen to 27-year-old seniors, selected on a random basis by
taking every third name on the class rolls; all were seen on two occasions prior
to the experimental testing. Of this group, 20 subjects, all of whom manifested at
least three deep-trance phenomena (one of which was the ability to open the eyes 
without awakening) were used. Again on a chance-selection basis, five males and 
five females were placed in Group C(ommand) and the same number in Group 
J(ustification). Each subject was hypnotized in the two sessions prior to the testing.
They were told that they were being tested for perceptual discrimination. Of the 
total group tested, none had ever been hypnotized before. 

Ten Control group subjects were randomly selected from another of the writer’s 
classes. When suggesting the experimental tasks, the operator used the command 
technique with five subjects and the justification technique with the other five. 
The same three antisocial activities were suggested to all subjects: 


1. Prior to being hypnotized, the subject was shown a small bottle of sulfuric 
acid and the experimenter demonstrated its disintegrative power. The subject was 
then told to close his eyes in preparation for hypnosis. The writer then covertly 
switched bottles with 10 subjects. The second bottle was filled with water, which is 
distinguishable from sulfuric acid only by its lesser density. The discrimination is 
not one which can usually be made by laymen, either in terms of appearance or the 
total weight of the bottle. No subject, upon later questioning, stated that he had 
discerned that the bottles had been switched. 

With the other 10 subjects, the operator covertly screwed the cap back on the 
bottle. Thus, 10 subjects were requested to pour (unsuccessfully) what actually was 
acid, while 10 subjects were allowed to pour what they thought was acid. 

After this maneuvering of bottles had been completed, the subject was told to 
open his eyes and grasp the bottle in his right hand. He was then hypnotized. On 
previous occasions, the subjects were accustomed to this brief closing of the eyes 
prior to being put into a trance. After a deep trance had been attained, the subject’s 
task was to tilt his hand so that the acid would spill on the hand of an observer. 

2. After five minutes had elapsed, during which amnesia for the previous task 
and deeper sleep were suggested, the subject was told that the door to an office at 
the end of the hall was open. Inside the office, on the desk, there was a bound Manila 
folder. On it was printed “Confidential Report from Dr. Herbert Weaver, Chairman
of the Psychology Department”. The subject was told to open his eyes without
awakening, bring the folder to the experimenter, and then tear it up.

*Now at Florida State University, Tallahassee, Florida.


3. After another interval of suggested amnesia and sleep, the subject was told 
to hold the bottle in his hand and, upon the experimenter’s signal, pour it on his 
own left hand which was lying next to his right hand. 


After each task had been suggested to a subject, he was allowed a few minutes 
to mull over the suggestions. Following his acceptance or rejection of the three 
tasks, he was told that he would remember everything that had happened, what he 
had thought and how he had felt at all times during the trance period. 


METHODS OF PRESENTING EXPERIMENTAL TASKS


Task I, Group C: The subject was simply told to pour the acid upon the ex- 
perimenter’s command. If he did not comply, the command was given in a slightly 
more forceful tone. If he still did not acquiesce, further relaxation and deeper sleep 
were suggested. If he carried out the order, the experimenter said, “Alright, that’s
enough. I’ll wipe off the acid before it burns too deeply.” 

Task I, Group J: The experimenter said that he realized that the subject knew 
that the bottle really didn’t contain acid (meantime mentioning some of the “clues”
that the subject had undoubtedly used). He then requested the subject’s assistance 
in measuring the reaction time of a more naive hypnotized subject who would enter 
the room. 

Task II, Group C: The subject was told to open his eyes and to bring the folder 
back to the experimenter’s office. If this was done, he was then ordered to tear up 
the folder and to place it among his own books. 


Task II, Group J: Under the guise of demonstrating his “hypnotic motor coordination”,
the subject was sent to get the folder. If he did so, he was then asked
to demonstrate his “manual dexterity” by taking only three pulls to tear the folder
in half.


Task III, Group C: In matter of fact terms, the subject was told to place his 
left hand next to his right hand and to pour the acid on it when so ordered. 

Task III, Group J: The subject was reminded that the liquid was not really 
acid, that it obviously had not burned the observer, and was then asked to demonstrate
his faith in the experimenter by pouring the liquid on his own hand when asked 
to do so. 

RESULTS 


The experimental results are presented in Table 1. The footnote refers to
instances where, although the act was carried out, there were considerations which 
indicated that the command would not have been complied with if the individual 
subject had been more certain that the task was as dangerous as the operator had 
said that it was. Inasmuch as the subject’s explanations as to why he carried out 
the act might have constituted attempts to rationalize objectionable behavior, the 
“yes” responses were included.


TABLE 1. NUMBER OF ANTISOCIAL ACTS COMMITTED BY THREE GROUPS

                  Task I        Task II       Task III
Group            Yes   No      Yes   No      Yes   No

Command (C)        5    5        2    8        4    6
Justify (J)        9    1       10    0       10    0
Control            0   10        3    7        1    9

*There were three questionable acts on Task I, one on Task II, and three on Task III.
Typical statements of Group C subjects indicated that if the subject complied
with the command, he usually did so because he felt compelled to or because he felt 
the operator wouldn’t really let him do anything wrong. The attitude of those who 
refused was “Why should I?”

Some typical statements of Group J subjects were: “I really didn’t want to, but 
I believed you.” “I thought it might hurt him, but not seriously. But I thought 
this must be pretty important to you, so I took a chance.” “Well, you said it was 
alright to do it.” 

As the foregoing statements indicate, the suggested amnesia for the preceding 
task(s) was not in every case successful. There were times, however, when the writer 
felt that some subjects were using the post-trance recall to rationalize their experi- 
mental behavior. 

DISCUSSION


Our experimental findings were that when 10 hypnotized persons were command- 
ed to do so, they committed a total of 11 antisocial acts (seven of which may be 
questioned) and refused to commit 19 other acts. Among a group of 10 other hyp- 
notized persons, whom the experimenter attempted to persuade (by justifying the 
suggestions) to perform the same tasks, 29 antisocial acts were committed, with one 
refusal. That the experimental tasks really were regarded as antisocial is attested to 
by the fact that a control group of 10 unhypnotized persons committed only four 
of the acts. 
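
The totals quoted in this paragraph can be checked against Table 1 with a few lines of arithmetic. The sketch below is only an illustration: the dictionary layout and the names are assumptions, and the Justify group's Task III "No" count is entered as 0, the value implied by the single refusal reported for that group.

```python
# (yes, no) counts per task for each group, transcribed from Table 1.
# The layout and names are illustrative assumptions, not part of the study.
TABLE_1 = {
    "Command (C)": [(5, 5), (2, 8), (4, 6)],
    "Justify (J)": [(9, 1), (10, 0), (10, 0)],  # Task III "No" taken as 0
    "Control": [(0, 10), (3, 7), (1, 9)],
}


def totals(rows):
    """Return (acts committed, acts refused) summed over the three tasks."""
    committed = sum(yes for yes, _ in rows)
    refused = sum(no for _, no in rows)
    return committed, refused


SUMMARY = {group: totals(rows) for group, rows in TABLE_1.items()}

if __name__ == "__main__":
    for group, (committed, refused) in SUMMARY.items():
        print(f"{group}: {committed} committed, {refused} refused")
```

Each group has 10 subjects and three tasks, so committed and refused acts should sum to 30 in every row, which gives a quick consistency check on the table.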


CONCLUSIONS 
Under the conditions of this experiment, hypnotized persons may commit anti- 
social acts under the influence of suggestion. Antisocial acts were committed much 
more readily when the situation was so structured that the hypnotized subject could 
justify his behavior. 
THE INTERACTION OF BACKGROUND AND CHARACTERS IN
PICTURE TEST STORY TELLING 


SOL CHAREN 
Catholic University of America 


PROBLEM 


Subjects given the Make A Picture Story Test (MAPS) pick an environmental
background as well as characters. The theorists ® ®) have emphasized that the stories 
told to picture tests must emphasize the relationship between the background setting 
and the behavior to be expected of the characters in such a setting. Schneidman, for 
example, believes that some of the MAPS background settings can be used to elicit 
problem areas such as drunkenness, sex “, »- 4). Whether this belief is true has not,
however, been experimentally validated.

What does the S study when he scans any picture? Does he use the whole pic- 
ture gestalt, or does he identify only with the characters or with an object in the 
picture which best suits projective needs? The purpose of this study is then to de- 
termine whether it is the background or the characters or both which influence the 
story told by an S to the MAPS test. 


*Numbers and character descriptions are those given by Schneidman in his manual on the MAPS
test (4, p. 160).
METHODS


The Living Room and the Stage scenes of the MAPS test were selected as the 
backgrounds to be compared. Because the Living Room has an open door showing 
part of a street scene, this picture was redrawn by an artist so that it became an 
interior with the door shown as closed, but all other details identical with the
original. 

These five sets of characters were picked: (1) M 15* Man with right hand in 
pants pocket* and F 16, Woman bending over; arms up; apron. (2) M 13 Man with 
both hands folded in front of him, looking down and F 10, Old lady with shawl. 
(3) M 6 Gangster; man with gun and F 2, Female undressing. (4) I 2 Rear view 
of seated figure. (5) S 1 Solid black male silhouette. 

The S’s were 25 male patients in a general hospital between the ages of 25 and 35
years, selected by ward physicians as having only a minor disorder and presenting 
no picture of emotional complaints. The instructions given them were to tell a story 
to each background and to bring out what had happened in the past, the present 
and what would happen in the future, using the instructions given by Murray ®?. 
They were also told to give new stories when the same backgrounds or characters 
were re-presented, to ignore the fact that they had already used the characters or 
backgrounds for other stories. 

Each patient was tested individually with the characters presented in this 
sequence to the Stage and Living Room. 


[Table: order of presentation, numbered 1 to 10, of the five sets of characters on the Stage and in the Living Room.]


In this manner Stage and Living Room were interchanged for all characters. The 
first S began with sequence 1 and finished with 10, the second S with sequence 2 
and finished with 1 and so forth. A total of 250 stories were obtained, half to the five 
sets of characters in the Living Room, half to the same five on the Stage. Scoring of 
the elements in each story was by means of the scoring method of Ruben Fine. 


RESULTS 


Of the 24 attitudes of Fine which were scored, only one showed a significant 
change (P = 5%) when the backgrounds were varied. Three others were significant 
at the 10% level. Over half the attitudes showed almost no change in stories told 
when characters were identical and backgrounds were changed. Only four attitudes 
out of 24 changed significantly when backgrounds were varied, but three of these 
were at the 10% level of confidence. The S’s then gave almost the same emotional
attitudes to characters presented with the Living Room as to the same characters
presented to the Stage. The data were also treated to show what changes took place
when the backgrounds were treated as the independent variable. Chi square results
showed that stories changed significantly as characters were changed.


CONCLUSIONS 


The experimental evidence of this study suggests that the S’s make the characters
in the MAPS test the medium for projection and tend to disregard the background.
This suggests that clinicians need to pay more attention to the characters
selected by their patients and place less emphasis on the background setting
used. Changing the backgrounds of MAPS test situations influences only slightly
the emotional tone and attitudes of stories told by patients. Changing the characters 
does more to change or emphasize the emotional tone of the stories than does the 
change of backgrounds. 
A STUDY OF EXAMINER INFLUENCE ON RESPONSES
TO MAPS TEST MATERIALS 


ALICE VAN KREVELEN 
Hollins College 


PROBLEM 


In a recent study ° the writer found significant differences between profile series 
obtained when the Szondi Test was self administered and when it was administered 
by an examiner. These differences were attributed to subject-examiner interaction. 
The present study attempts to determine whether such effects can be demonstrated 
using MAPS Test materials. The questions investigated can be stated as follows: 
Were there significant differences between stories told to E and stories written by Ss
when they were alone in regard to (1) number of figures used, (2) repetition of figures, (3)
particular figures selected, (4) number of words per story, (5) emotional tone of the 
stories? 


METHOD 


Twenty normal adults between the ages of 19 and 30 years took part in the ex- 
periment. Two background pictures from the MAPS Test were used, the picture of a 
bare stage and the picture of a raft. Ss were instructed in the use of the MAPS 
materials and told two stories for each picture, one dictated to E and the other 
written after E had left the room, producing in all four stories. In this way a total of 
80 stories was obtained, 40 done in the presence of E and 40 when Ss were by them- 
selves. A particular pattern of alternation was used so that half of the stories told for 
each picture were first stories and half were second stories for the same background. 
This was done in an attempt to control effects of using the same picture for two 
stories. 


RESULTS 


There were no significant differences in the number of figures used or the parti- 
cular figures selected. The stories were rated for emotional tone according to the 
general rating scale categories used by Eron, Terry and Callahan”) with TAT 
stories. There was no significant difference in emotional tone between stories pro- 
duced alone and with E. 

When Ss used the more structured background (the raft) there was no significant 
difference in the number of words per story; however in the more unstructured sit- 
uation (the bare stage) a difference significant at the .04 level of confidence was 
found. Stories written about the picture of the bare stage were significantly longer 
than those told to E. 
SUMMARY


Twenty adults produced stories to two MAPS backgrounds; two of the stories
were told to E and two were written in the absence of E. Conclusions must be limited
because only two MAPS backgrounds were used.

The fact that the stories written by Ss in the absence of E were significantly 
longer using the unstructured background may be worthy of some comment. If the 
assumption is made that the unstructured picture presented a greater challenge to 
the imagination it can be concluded that Ss in this study were able to think more 
creatively or at any rate create more elaborately in the absence of E. If the clinical 
value of the stories bears any direct relationship to the amount of verbal material 
contained it might prove rewarding to have Ss write their stories in the absence of 
the examiner in cases where such a procedure would be feasible. 

It is concluded that with the exception of the length of stories produced in 
response to the unstructured situation no subject-examiner effects were demonstrated 
for any of the formal aspects of the test which were investigated. 
CONCERNING THE TRUTH VALUES OF CLINICAL STATEMENTS
FRANK M. DU MAS 
Michigan State College 


There is some ambiguity regarding the appropriate frame of reference that 
should be employed in the evaluation of the degrees of truth associated with clinical 
statements about a patient. Some thinkers“: P- 17; 2, p. 158; 4, p. 58) believe that some 
clinical statements are certainly true or certainly false. Others®: »- *» imply that no 
such statement can be made with certainty. The problem is: how can one best eval- 
uate the truth of clinical propositions? 

All propositions are either formal (mathematics, logic) or empirical (statements 
about phenomena). The degree of truth of a proposition may be regarded as a pre- 
diction and, therefore, directly related to the probability continuum. All proposi- 
tions have a truth probability, P, which may be evaluated by a multivalue logic 
whose general model is 

0 ≤ P ≤ 1. (1)


The limits of relation (1) are appropriate for the evaluation of formal proposi- 
tions, that is, 


P=0,1 (2) 
and these propositions are certainly true (T) or certainly false (F). 


Empirical propositions are evaluated by reference to the interval between the 
two limits, that is, 


0<P<1. (3) 


Since clinical propositions are empirical propositions, from relation (3) it follows 
that they can never be absolutely true or false. 
In the practical world of affairs, the clinician often postpones judgement so
that he can study data, consult with others, etc. Certain propositions about a pa- 
tient are doubtful (D). A useful 3-value logic is 


|F|D|T|. (4) 


The social matrix in which a clinician functions demands, sooner or later, that 
he arrive at decisions of an “‘either-or’’ nature. That is, a patient is or is not admitted 
to a hospital, a lobotomy is or is not performed, etc. Hence, a 2-value logic is


| F | T |. (5)


Relation (4) may be defined as three intervals of relation (2), and relation (5) 
defined as two intervals of relation (2). Relation (4) is not adequate for decision 
making which inaugurates a course of action when very close decisions are made 
because the D-interval may be too wide. 
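
The idea of defining the truth categories as intervals of the probability continuum can be sketched in a few lines. This is a minimal illustration only: the cut points (one third, two thirds, one half) and the function names are assumptions chosen for the example, since the text does not say where the intervals should be drawn.

```python
def three_value(p, lower=1 / 3, upper=2 / 3):
    """Map a truth probability p (0 <= p <= 1) onto F, D or T.

    The cut points `lower` and `upper` are hypothetical; widening the gap
    between them widens the D-interval.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p < lower:
        return "F"
    if p > upper:
        return "T"
    return "D"


def two_value(p, cut=0.5):
    """Map a truth probability p onto F or T with a single hypothetical cut."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return "T" if p >= cut else "F"


if __name__ == "__main__":
    for p in (0.10, 0.45, 0.55, 0.90):
        print(p, three_value(p), two_value(p))
```

With these assumed cut points, propositions with P of .45 and .55 both fall in the D-interval of the 3-value logic, yet the 2-value logic must assign them to opposite categories, which is the difficulty with close decisions noted above.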

Our analysis indicates that: (a) clinical propositions are empirical propositions 
which are best evaluated by reference to a multivalue logic, (b) when a clinician 
wishes to postpone judgement a 3-value logic is useful, (c) demands of society force the
clinician to use a 2-value logic when deciding on a final course of action, (d) the 
truth categories (F, D, T) may be defined as intervals of the truth continuum, and 
(e) of the infinite number of possible logics, three are of special significance to the 
clinician: multivalue, 3-value, 2-value. 
RELIABILITY, CHANCE, AND FANTASY IN INTER-JUDGE
AGREEMENT AMONG CLINICIANS* 


WILLIAM A. HUNT, FRANKLYN N. ARNHOFF AND JOHN W. COTTON 


Northwestern University 


While this paper presents research data its primary purpose is pedagogical. 
Reliability is the Achilles heel of those clinical disciplines employing the intuitive 
judgmental process as an operating technique and hence it is of tremendous interest 
to clinical psychologists. Whenever a clinical study yields data which may bear
upon the factor of reliability, such reliability is eagerly surveyed and reported, and 
always can count on a fascinated if not necessarily enthusiastic audience. For our 
purposes here, reliability will be defined as inter-judge agreement, in a judgmental 
situation closely approaching actual operating clinical practice. 

Our data were obtained from a previously reported study in which 60 clinical 
psychologists with four years or more on-the-job professional experience were 
given a group of 10 schizophrenic responses to items from the Wechsler-Bellevue 
and Terman-Binet vocabulary tests®’. They were then asked to rate each of the 
responses for the severity of the disorder in the thinking processes exhibited using 
an 11-point scale. As a measure of reliability or inter-judge agreement we correlated 
the rank order of the 10 stimuli for each judge with the rank order obtained
by averaging the judgments of all 60 clinicians. While there is some contamination
here, since each judge contributed to the group average, the proportion of 1 to 60
renders this negligible.

*This study is part of a larger project continuing under ONR contract 7onr-450(11) with Northwestern
University. The opinions expressed, however, are those of the individual authors and do not
represent the opinions or policy of the Naval service.
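
The measure described here, a rank order correlation between each judge's ratings and the average rating of the whole group, can be sketched as follows. This is an illustration under stated assumptions, not the authors' computation: the function names are hypothetical, tied ratings are given averaged ranks, and the rank correlation is computed as a Pearson r on the ranks.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation; undefined if either list is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Rank order correlation computed as Pearson r on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def judge_vs_group(ratings):
    """ratings[j][i] is judge j's rating of item i; returns one rho per judge."""
    n_items = len(ratings[0])
    group_mean = [mean(judge[i] for judge in ratings) for i in range(n_items)]
    return [spearman(judge, group_mean) for judge in ratings]
```

Because each judge's own ratings enter the group average, the result is slightly inflated, which is the contamination mentioned above; with 60 judges each one contributes only one sixtieth of the average.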

Brevity and economy in reporting usually dictate the use of some single measure 
of reliability, which in this case might well be the equivalent of an average r. For 
our purposes, however, it seems wiser pedagogically to present a complete table of 
all 60 rs. This appears in Table 1. Inspection immediately shows the wide range of 
rs from +.02 to +.93, with a modal clustering in the .60’s. This might be viewed
as representing a true value in the .60’s with an error distribution about this point,


TABLE 1. RANK ORDER CORRELATIONS OF EACH JUDGE’S RATINGS WITH AVERAGE RATING
OF 60 JUDGES

.59   .64   .68
.59   .64
.60   .64
.61   .66
.61   .66
.62   .67




or it might be viewed as a continuum of ability with individual clinicians distributed 
upon it. Actually, it must be both but it seems safe to assume that differences in 
ability are at least in part responsible for the distribution, and that in terms of the 
ability to make reliable judgments in the sense used here, clinicians vary tremendously.
There would seem to be “good” clinicians and “bad” clinicians. While this fact
is implicitly recognized among clinicians and occasionally reported in the litera- 
ture ©), it is seldom taken account of in either experimental designs involving clinical 
judgment“? or in actual clinical practice utilizing such judgments. Certainly this 
wide range of ability is concealed by the use of any single measure. 

To illustrate this, let us use such a single measure. We select Alexander’s r
as a measure of the average r between pairs of judges (1). When such an overall
measure is applied to our data it comes out +.33. It is an honest measure and 
statistically justified, but in this case it conceals some very important information 
concerning the range of ability among clinicians, a fact which is evident if we consult 
Table 1. 

So far we have been considering “reliability.” Let us now consider “chance.”
Since 60 clinicians is an unusually large sample for this type of study, we may feel 
secure. Most studies use many fewer subjects. Suppose we had had only 20 subjects 
in our group. We can attempt to answer this conjecture by splitting our group of 60 
randomly into three groups of 20 clinicians each. When we do this and apply the 
same measure, we find the following average rs— +.19, +.51, +.26. The range here 
is noticeable. In fact the r of +.51 is so far out of line as to establish the lack of 
homogeneity in this sampling, although it was achieved in random fashion. In terms 
of our total sample of 60 it is evident that a value of +.19 would underestimate 
“typical” clinical ability and +.51 would overestimate it. This demonstrates the 
part that chance (in sampling) may play in measuring inter-judge agreement among 
clinicians. 
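
The splitting exercise described above can be imitated with a short sketch. Everything here is an assumption made for illustration: the mean of all pairwise rank correlations stands in for the single overall measure (it is not claimed to be Alexander's r), each judge is assumed to give an untied ranking of the items so that the simple difference formula applies, and the rankings in the demonstration are random numbers, not the study's data.

```python
import random
from itertools import combinations


def spearman_no_ties(rank_a, rank_b):
    """Rank correlation by the difference formula; valid only without ties."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))


def mean_pairwise_rho(rankings):
    """Average rank correlation over every pair of judges in the group."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman_no_ties(a, b) for a, b in pairs) / len(pairs)


def split_into_groups(judges, n_groups, seed=0):
    """Shuffle the judges and cut them into n_groups equal-sized subgroups."""
    rng = random.Random(seed)
    shuffled = list(judges)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_groups
    return [shuffled[g * size:(g + 1) * size] for g in range(n_groups)]


def demo(n_judges=60, n_items=10, n_groups=3, seed=1):
    """Hypothetical data: every judge ranks the items in a random order."""
    rng = random.Random(seed)
    judges = [rng.sample(range(1, n_items + 1), n_items) for _ in range(n_judges)]
    return [mean_pairwise_rho(g) for g in split_into_groups(judges, n_groups, seed)]


if __name__ == "__main__":
    print(demo())
```

Running the split with different seeds shows how much a subgroup average can move through sampling alone, which is the point the three values of +.19, +.51 and +.26 are meant to make.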

Now let us go one step further. Let us assume either that our data are not 
amenable to treatment by the method of rank order correlation, or that such a 
stodgy, commonplace measure seems too pedestrian for our use. Then let us indulge 
in a little status-connected statistical and logical interpretation. We may say that 
unless the clinicians were showing some agreement between themselves the items 
judged would not be statistically significantly distinguished one from the other. A 
measure of the significance with which the items are discriminated would then be 
closely related to inter-judge agreement and might serve as an indirect measure 
thereof. This assumption is sensible and supposedly the resulting measure might 
be informative provided we kept firmly in mind how it had been derived. For such 
a purpose we decided to use Hoyt’s r which gives a measure of the reliability of the
average judgments for the several items (3). When applied to our data, it gives an
r of +.97. Its use for our purposes represents fantasy in inter-judge agreement
among clinicians.


REFERENCES 


1. ALEXANDER, H. W. The estimation of reliability when several trials are available. Psychometrika, 1947, 12, 79-99.
2. ARNHOFF, F. N. Some factors affecting the unreliability of clinical judgments. J. clin. Psychol., in press.
3. HOYT, C. J. Test reliability estimated by analysis of variance. Psychometrika, 1941, 6, 153-160.
4. KELLY, E. L., and FISKE, D. W. The prediction of performance in clinical psychology. Ann Arbor: University of Michigan Press, 1951.
5. a . Clinical intuition and test scores as a basis for diagnosis. J. consult. Psychol., 1949,





THE RELATIONSHIP OF THE ROSENZWEIG PF STUDY* 
TO THE MMPI 


HERBERT QUAY AND ANDERS SWEETLAND 
Milledgeville (Georgia) State Hospital Florida State University 


PROBLEM 


Previous studies ®: © led us to believe that Rosenzweig’s directions of expression 
of aggression“) were related to emotional adjustment. We concluded that Extra- 
punitiveness (E) and Intropunitiveness (I) were associated with maladjustment, 
while Impunitiveness (M) was linked with good adjustment. The present experiment
is an attempt to check this hypothesis.


METHOD 


Since the Rosenzweig Picture-Frustration Study “ was designed to measure the 
aforementioned concepts, it was used for this purpose. The scores for the direction of
expression of aggression (E, I, and M) were correlated with subscale scores on the 
Minnesota Multiphasic Personality Inventory (MMPI) used as a measure of 
emotional adjustment. 

The two tests were given to ninety-one college students taking classes in intro- 
ductory psychology. As this course is taken by 95 per cent of the students in this 
University, it was felt that the sample was adequately random. The scores from the
two tests were correlated (Pearson r) after first converting raw MMPI scores to
T scores.


RESULTS 


The correlations between the two tests are found in Table 1. There is a slight 
tendency for the data to support our hypothesis, but it is not impressive. E and I
seem to indicate maladjustment, the majority of the correlations being positive; M
indicates adjustment, the majority of the correlations being negative. None of the
correlations was high enough to have practical significance. We are amused that
the paranoia scale of the MMPI correlates negatively with extrapunitiveness. That 
was the one correlation which we expected to be both high and positive. 

Although there seems to be a slight relationship between the MMPI and the 
PF, it is difficult to give much significance to our findings. Either the two tests are 
measuring in different areas (most likely), or one (or both) of them is measuring 
inadequately. It is our feeling that improvement of the reliability of the PF might 
yield more impressive correlations. 


*We are indebted to Miss Helen Hewlett for her invaluable clerical assistance. 
TABLE 1. CORRELATIONS BETWEEN THE ROSENZWEIG PF E, I AND M
SCORES AND THE MMPI* SUBSCALE SCORES

MMPI          Extra-       Intro-       Im-
Subscales     punitive²    punitive³    punitive

F               0.16         0.12         0.11
K¹              0.07         0.11         0.11
Hs              0.14         0.03        -0.17
Hy              0.20         0.19         0.03
D               0.06         0.01         0.10
Pd              0.12         0.06        -0.09
Pa              0.18         0.14         0.16
Pt              0.21         0.03         0.26
Sc              0.23         0.08        -0.17
Ma              0.21         0.06         0.14
Si              0.21         0.12         0.13

*To be reliable at the .05 level the correlations must equal or exceed 0.205.

1. Signs of the K scale have been reversed to facilitate direct comparison.
2. The chances of getting eleven of twelve correlations with the same signs is 0.005
3. for getting seven of eleven is 0.134
   for getting nine of eleven is 0.27

In this instance a one-tailed test might be justified; if so, the indicated probabilities
may be halved, i.e. 0.0025, 0.0670, 0.135.


SUMMARY 


It was hypothesized that extrapunitiveness and intropunitiveness are correlated 
with emotional maladjustment and that impunitiveness is correlated with adjust- 
ment. To test this, ninety-one college students were given the Rosenzweig Picture- 
Frustration Test and the Minnesota Multiphasic Personality Inventory. There was 
a slight trend of the correlations in the predicted direction. However, all correlations 
were too low to have practical significance. It is suggested that the reliability of the 
Rosenzweig test be improved. 
A NOTE ON SEX DIFFERENCES ON THE WECHSLER-BELLEVUE TEST
H. A. GOOLISHIAN AND AUSTIN FOSTER 


University of Texas Medical Branch, Galveston, Texas 


INTRODUCTION 


Recent research indicates that there are significant sex differences on various 
sub-tests of the Wechsler-Bellevue Intelligence Scale, Form I“: *). Wechsler, however,
points out in his manual that there are no statistically significant sex differences
despite the fact that females tend to have higher mean total scores at every
age level. As Jastak“) has indicated, the problem of differential functioning on the 
Wechsler-Bellevue by males and females is an important issue since sex differences 
may distort interpretation of sub-test scatter. A disturbing feature of the research 
reported in this area is that there are considerable discrepancies. Strange and 
Palmer? report, for instance, that males are significantly higher on comprehension 
than are females. On the other hand, Jastak“? reports that females do better on this 
sub-test. In general then, data regarding sex differences on the Wechsler-Bellevue, 
while available, are such that more information is needed in order that a more valid 
determination may be made between population differences and true sex differences. 


SAMPLE AND PROCEDURE 


The present sample consisted of 392 white psychiatric patients, 190 males and 
202 females, to whom the Wechsler-Bellevue, Form I, had been given as part of a
larger diagnostic battery. All records within a given time period were utilized with
the exception of those patients having known organic brain pathology. No data 
are available on object assembly, a sub-test which is not routinely given at this 
clinic. 

The means and standard deviations of the weighted scores for each sub-test 
were computed along with the means and standard deviations for the IQ’s, age, and 
education. The differences between the means of the males and females were
compared and the significance tested by Student’s t-test.
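
As a minimal sketch of the comparison just described, the following computes Student's t from two groups' means, standard deviations, and sample sizes. The pooled-variance form and the function name `pooled_t` are assumptions made for illustration; the note does not say which computational formula was used, nor whether the standard deviations were computed with N or N - 1. The inputs are the Arithmetic sub-test figures as printed in Table 1.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df) for a two-sample Student's t-test from summary statistics.

    Assumption: the standard deviations are sample values (n - 1 denominator)
    and the two groups share a common variance (pooled-variance form).
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, df


# Arithmetic sub-test, values as printed in Table 1 (males first, then females).
t_arith, df_arith = pooled_t(9.67, 4.83, 190, 7.22, 3.62, 202)
print(f"t = {t_arith:.2f} on {df_arith} degrees of freedom")
```

With 390 degrees of freedom, an absolute t beyond roughly 1.97 corresponds to the two-tailed .05 level, so a value computed this way can be read against the criterion of significance adopted in the note.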


RESULTS AND DISCUSSION


The statistical findings are reported in Table 1. If the .05 level or better is taken
as significant then the males are superior in functioning on comprehension, arith- 
metic, digit span, picture completion and block design, as well as verbal, perform- 
ance, and full scale IQ’s. There are no statistically significant differences in either 
age or education. In comparing these results with those of Jastak and Strange and 
Palmer, it can be seen that there is total agreement in the finding that males are 
superior on arithmetic, digit span, and picture completion. That is, in three inde- 
pendent researches these three sub-tests have been found to differ in favor of the 
males. If agreement two times out of three is considered, then all tests with the ex- 
ception of similarities, digit symbol, and vocabulary have been found to differ sig- 
nificantly between the sexes. While it may be possible to argue that these differences 
are due to sampling error, such as deviations due to the nature of the psychiatric 
populations, it would seem more parsimonious to conclude that there is at least
presumptive evidence that the differences are truly due to the differential functioning
of the two sexes. This is particularly so, perhaps, for those tests on which there is
unanimous agreement as to the direction and significance of the differences. The warning by
Jastak and Strange and Palmer, that this factor should be taken into account in 
research and clinical use of this instrument seems well taken. 
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TABLE 1. COMPARISON OF MALES AND FEMALES ON THE WECHSLER-BELLEVUE SUB-TESTS, IQ’S, AND
EDUCATION

Wechsler-Bellevue        Males N = 190      Females N = 202     Males vs Females
Sub-tests                Mean      S.D.     Mean      S.D.      t       P

Information              10.29     3.31      9.30     2.69
Comprehension            10.61     3.24      9.96     3.01
Digit Span                8.35     3.11      7.49     3.15
Arithmetic                9.67     4.83      7.22     3.62
Similarities             10.57     3.53     10.37     3.11
Vocabulary               10.37     3.34     10.05     2.98
Picture Arrangement       9.65     3.31      9.09     3.23
Picture Completion        9.94     3.55      9.12     3.44
Block Design             10.19     3.03      9.39
Digit Symbol              9.12     3.07      9.67
Verbal I. Q.            106.42    19.16    100.59
Performance I. Q.       103.76    17.37     99.99
Full Scale I. Q.        105.63    19.08    100.42
Age                      30.52    10.97     29.70
Education                11.46     3.81     11.73

STRANGE, F. B., and PALMER, J. L


SUMMARY 


The sub-test scores and IQ’s of the Wechsler-Bellevue Form I for 202 females 
and 190 males were compared. Eight of the comparisons showed significant sex 
differences at the five percent level or better. This evidence supports the findings of 
Jastak and Strange and Palmer that sex differences on the Wechsler-Bellevue sub- 
tests do exist and that this fact should be taken into account in research or clinical 
work with this instrument. 
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EDITORIAL OPINION 





PROFESSIONAL “BLACK SHEEP” 


Every clinical profession is faced with the problem of dealing with a fringe of 
“black sheep” with marginal educational and experiential backgrounds who persist 
in attempting to practice in the field. These people constitute an important out- 
group whose activities cannot be ignored but usually cannot be controlled because 
of their ineligibility for membership in the organizations which police the profession. 
Clinical psychology has its share of these problems which are not basically different 
from those found in medicine. Any full consideration of the problem must consider 
the sources of these cases, as well as the optimum ways of handling them once they 
appear. 

First, it should be recognized that many of these “black sheep” were made and
not born. The modern “convoy” system of graduate education with its rather rigid
selection procedures and regimentations must inevitably produce numbers of dissent- 
ers and deviants whose individual needs may bring them into conflict with estab- 
lished procedures. Too often we regard as ideal the conformist student who laps up 
academic pap docilely. And sadly enough, there are many brilliant students whose 
original thinking and protesting attitudes so challenge the mediocrity of their in- 
structors as to stimulate rejection reactions. Indeed, many of the most original 
thinkers in history have been rejected by their contemporaries, let alone given
high academic positions. Good examples are the discoverers and leaders
of the psychoanalytic movement: Freud, Adler, Jung and others have become
latter-day saints only since it has become fashionable to court the psychiatrists.
Undoubtedly the most original thinker in the whole history of psychology, Freud 
never received the recognition he deserved from contemporary academicians.
Another source of “black sheep” is those who do not happen to have the exact
professional background for licensure and certification and who are therefore ex- 
cluded from professional standing and memberships. The issue is competence and 
not the road whereby it is attained. Our field is so new that it is entirely possible for 
the most competent clinician to be self-taught or to receive training along many 
different patterns. 

A corollary to this first proposition that “black sheep” are often made and
not born is the postulate that with current admittedly-inadequate selection methods, 
it is inevitable that a certain percentage of all students accepted in professional 
training programs will become casualties or black sheep. The medical profession 
has established a precedent in its effort to rehabilitate and assimilate such cases 
within the established regulations and procedures of professional societies. It is 
better to accept these people and patiently labor to rehabilitate them. This can be 
accomplished by therapeutic rather than punitive techniques. Instead of summoning 
them as truants before an ethics committee, or otherwise denouncing them to the 
world, how much better to organize informal counseling to help them face the situa- 
tion in a friendly way. Current methods of handling such problems are too all-or- 
none, implying that either you conform to our standards or you are no good. There 
is enough work to go around. Let us not discard potential sources of
labor but instead attempt to place each one at his level of competency. 

A good example of how professional black sheep are made may be cited from a 
case history known to us. XY was a student with all the intellectual qualities 
requisite for acceptance as a professional student but with an unattractive person- 
ality characterized by physical ugliness and a tendency to go against people unless 
they accepted him immediately. XY was accepted on a master’s level of training but 
rejected on his application for doctorate training. He brooded and became slightly 
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paranoid over this action but insisted on being admitted to extension courses. Re- 
peated attempts on his part to be admitted to the in-group were rebuffed. He began 
to feel discriminated against and at times was quite vociferous in demanding his 
rights in a democracy. He joined leftist organizations and became known as a radical 
which still further handicapped his efforts to secure academic and professional
recognition. He is now operating as a “lone wolf,” bitterly critical of established
authority, and considered a renegade by professional groups. Essentially this is a story
of psychologists who failed to understand and heal one of their number. 

Second, the profession should assume full responsibility either for assimilating
all those who are competent regardless of background of training, or else for
rigorously limiting training opportunities to those who are selected as being of true
professional calibre. The universities and the profession should be held completely responsible
for present practices which permit those who are rejected from established training 
programs to get a “bootleg” education via unrecognized extension courses and other 
training situations which are open to all who have the matriculation fee. It is cur- 
rently true that anyone with a B. A. degree can shop around and take all the courses 
leading to masters or doctorate degrees without benefit of enrollment in any formal 
training program. As long as the universities are so liberal in opening advanced 
courses to all who are able to pay the fee, it is inevitable that many self-anointed 
practitioners will appear. 


Finally, there is the problem of how to handle people once they appear. It is an 
interesting thing that although clinical psychologists and psychiatrists usually in- 
clude the healing attitude in their self-concepts, they do not always maintain a 
therapeutic attitude in dealing with members of out-groups. Broadly speaking, 
there are two main avenues of approach to the problem of the black sheep. The older 
and less successful approach is to rigorously reject, ignore, exclude, punish and perse- 
cute them for their brashness and deviancies. This method will only succeed in driv- 
ing them further from the fold. The more therapeutic method is to accept their 
existence, recognize their needs, befriend and gain their confidence, counsel and guide 
them in ways for qualifying for accepted standards, and otherwise help them to live 
up to existing professional standards and competencies. Undoubtedly many black 
sheep can be rehabilitated and accepted into the in-group once their needs are under- 
stood and remedied. They are not renegades at heart, but usually frustrated people 
who wish to conform and be accepted, but who typically become rebellious when 
they are ostracized and otherwise discriminated against. Of course these remarks 
do not apply to outright charlatans and psychopathic quacks but only to sincere but 
misguided people. If you can’t coerce them, help them join us.
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TAYLOR, W. S. Dynamic and Abnormal Psychology. New York: American Book
Company, 1954. Pp. 658. $5.50. 


The author is professor of psychology at Smith College with an established 
reputation for his scholarly writing in the field of abnormal psychology. This book 
summarizes, sometimes in overly academic fashion, a rich collection of the theories 
and clinical observations of classical abnormal psychology. As with many other
compendious books, students may find this one a little difficult to study because of
the necessity of mastering a large number of definitions and theoretical concepts which
only an advanced student could assimilate and place in proper perspective. It is
not always easy to differentiate where basic science psychology ends and speculative 
philosophizing starts since references as far back as 1910 are given the same
prominence as those of the last decade in relation to such topics as “varieties of will”,
“freedom” and “responsibility”. All in all, an interesting book, but not one that
should be blindly assigned to beginning students. 
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New York: American Orthopsychiatric Association, 1953. Pp. 368. $5.00. 


This is a collection of case studies originally presented at Workshops held at 
annual meetings of the American Orthopsychiatric Association and available only 
to relatively small groups of members. The popularity of the Workshop series led 
to the decision to publish thirteen of the most significant case presentations. This 
material will serve as a valuable source of clinical case studies for students. 


VEDDER, CLYDE B. (ED.). The Juvenile Offender. New York: Doubleday, 1954. Pp.
510. $6.00. 


This is a valuable source book arranged so as to provide introductory comments 
by the editor and selected readings by various authorities on all aspects of juvenile 
delinquency. The topics are exceedingly well chosen and the individual contributions 
are all of high quality and very readable. 
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therapy. 
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A translation by David Rapaport of Schilder’s Medizinische Psychologie, orig-
inally published in 1924. 


FREUD, SIGMUND. On Aphasia. New York: International Universities Press, 1953.
Pp. 105. $3.00. 


Translated by E. Stengel, this little known work by Freud was first published 
in German in 1891. 


PERRY, RALPH BARTON. Realms of Value. Cambridge, Mass.: Harvard University
Press, 1954. Pp. 497. $7.50. 


A critique of human civilization by Harvard’s Pierce professor of philosophy. 
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York: Jewish Occupational Council, 1841 Broadway, New York 23, N. Y. 
Pp. 32. $1.00. 


JUNG, C. G. Psychological Reflections. New York: Pantheon Books, 1953. Pp. 342.
$4.50. 


This is an anthology of the writings of C. G. Jung, selected and edited by 
JOLANDE JACOBI as volume XXXI of the Bollingen Series. It consists of a collection
of philosophical opinions on the nature and activity of the psyche, Man in his relation 
to others, the world of values, and on ultimate things. 


Harms, Ernest. Essentials of Abnormal Child Psychology. New York: Julian Press, 
Pp. 265. $5.00. 


Dr. Harms, who is well known as editor of the Nervous Child, gives us a survey 
of 25 years of experience and research in child abnormal psychology. 


SARTRE, JEAN-PAUL. Existential Psychoanalysis. New York: Philosophical Library,
1953. Pp. 275. $4.75. 


Two essays from Sartre’s Being and Nothingness attempt to present the basis
for a universal ontology of people and things. Impressionism may have a place in
modern art but in science it is an anachronism. Sartre views the “human condition”
as involving two kinds of being. “Being-in-itself is a plentitude or fulness
characterized by impermeability and infinite density” while “Being-for-itself is to be what
it is not and not to be what it is. . . . Man’s . . . nature is never to-be but always
to-be-about-to-be”. Unfortunately it is not clearly stated in which viscera the
plenitude of fulness exists.


LANGFELD, H. S. et al (EDS.) A History of Psychology in Autobiography. Vol. IV.
Worcester, Mass.: Clark University Press, 1952. Pp. 356. $7.50. 


Another of the valuable autobiographical series of famous psychologists in- 
cluding contributions by Bingham, Boring, Burt, Elliott, Gemelli, Gesell, Hull, 
Hunter, Katz, Michotte, Piaget, Pieron, Thomson, Thurstone and Tolman. Clini- 
cians will be interested in the fact that E. G. Boring, the only one psychoanalysed, is 
the only one who reveals much about the dynamisms of his inner self. 


OSGOOD, CHARLES E. Method and Theory in Experimental Psychology. New York:
Oxford University Press, 1953. Pp. 800. $10.00. 


Dr. Osgood is associate professor of Psychology at the University of Illinois. 
Although the title would appear to cover all of experimental psychology, the four 
sections of this book deal only with sensory processes, perceptual processes, learning 
and symbolic processes. The book is scholarly and comprehensive. 


WHITEHORN, J. C. et al (EDS.) The Psychiatrist: His Training and Development.
Washington: American Psychiatric Association, 1953. Pp. 214. 


A report of the 1952 conference on psychiatric education held at Ithaca, N. Y.
under the auspices of the American Psychiatric Association and the Association of
American Medical Colleges.


NOTCUTT, BERNARD. The Psychology of Personality. New York: Philosophical
Library, 1953. Pp. 259. $4.75. 


A series of essays presenting a critical overview of the contributions of the 
various schools of psychology to the nature of personality. 


Yost, O. R. What You Should Know About Mental Illness. New York: Exposition 
Press, 1953. Pp. 165. $3.50. Elementary lectures for laymen. 
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WOLFFHEIM, NELLY. Psychology in the Nursery School. New York: Philosophical
Library, 1953. Pp. 144. $3.75.

Reports on years of experience in operating a nursery school on Freudian
theoretical backgrounds.

MOLONEY, J. C. Understanding the Japanese Mind. New York: Philosophical
Library, 1954. Pp. 252. $3.50. 


A psychiatrically-oriented anthropological study of Japanese culture and per- 
sonality development. Japanese culture emphasizes the insignificance of the individ- 
ual in striving for the goal of uniform conformity of behavior. Japanese society 
is organized to a degree hardly found elsewhere and in which there is a required 
time and place for everything. 


Stott, D. H. Saving Children from Delinquency. New York: Philosophical Library, 
1953. Pp. 266. $4.75. 


An English educator discusses the problem of juvenile delinquency. 


BURROW, TRIGANT. Science and Man’s Behavior. New York: Philosophical Library,
1953. Pp. 564. $6.00.


This volume consists of two parts. The second part includes the complete text 
of The Neurosis of Man, Dr. Burrow’s last book in which he further developed his 
concepts of phylobiology and phyloanalysis. In 1948, during a delay in publishing 
this work, Dr. Burrow conceived the idea of writing to 29 men of science inviting 
them to comment on his views. The resulting correspondence, together with inter- 
pretive comments, is presented in part one. 
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Lectures given under the auspices of the Department of Psychology, 
University of Kentucky, by Donald K. Adams, R. B. Ammons, John M. 
Butler, Raymond B. Cattell, Harry F. Harlow, Norman R. F. Maier, 
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field may further understanding in the others. 1954. 164 pages. $3.50.


FUNDAMENTALS OF 
PSYCHOANALYTIC TECHNIQUE 


By the late TRYGVE BRAATOY, M. D., formerly of the Topeka 
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analysis. The author points out that while the patients’ experience is
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lar value because so much of the literature in this field has been concerned 
with theory rather than concrete description. 1954. 404 pages. $6.00.
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of Education with Therapy 





EXCELLENT professional facilities and
long experience have equipped Devereux
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therapy to the boy or girl whose emotional 
disturbances block his normal ability to 
learn. 
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